Tom Lane [Mon, 2 May 2011 16:08:08 +0000 (12:08 -0400)]
Improve aset.c's space management in contexts with small maxBlockSize.
The previous coding would allow requests up to half of maxBlockSize to be
treated as "chunks", but when that actually did happen, we'd waste nearly
half of the space in the malloc block containing the chunk, if no smaller
requests came along to fill it. Avoid this scenario by limiting the
maximum size of a chunk to 1/8th maxBlockSize, so that we can waste no more
than 1/8th of the allocated space. This will not change the behavior at
all for the default context size parameters (with large maxBlockSize),
but it will change the behavior when using ALLOCSET_SMALL_MAXSIZE.
In particular, there's no longer a need for spell.c to be overly concerned
about the request size parameters it uses, so remove a rather unhelpful
comment about that.
Rewrite installation makefile rules without for loops
install-sh can install multiple files at once, so for loops are not
necessary. This was already changed for the rest of the code some
time ago, but pgxs.mk was apparently forgotten, and the obsolete
coding style has now been copied to the PLs as well.
This also fixes the problem that the for loops in question did not
catch errors.
Tom Lane [Sun, 1 May 2011 21:57:33 +0000 (17:57 -0400)]
Make CLUSTER lock the old table's toast table before copying data.
We must lock out autovacuuming of the old toast table before computing the
OldestXmin horizon we will use. Otherwise, autovacuum could start on the
toast table later, compute a later OldestXmin horizon, and remove as DEAD
toast tuples that we still need (because we think their parent tuples are
only RECENTLY_DEAD). Per further thought about bug #5998.
Tom Lane [Fri, 29 Apr 2011 20:29:42 +0000 (16:29 -0400)]
Remove special case for xmin == xmax in HeapTupleSatisfiesVacuum().
VACUUM was willing to remove a committed-dead tuple immediately if it was
deleted by the same transaction that inserted it. The idea is that such a
tuple could never have been visible to any other transaction, so we don't
need to keep it around to satisfy MVCC snapshots. However, there was
already an exception for tuples that are part of an update chain, and this
exception created a problem: we might remove TOAST tuples (which are never
part of an update chain) while their parent tuple stayed around (if it was
part of an update chain). This didn't pose a problem for most things,
since the parent tuple is indeed dead: no snapshot will ever consider it
visible. But MVCC-safe CLUSTER had a problem, since it will try to copy
RECENTLY_DEAD tuples to the new table. It then has to copy their TOAST
data too, and would fail if VACUUM had already removed the toast tuples.
Easiest fix is to get rid of the special case for xmin == xmax. This may
delay reclaiming dead space for a little bit in some cases, but it's by far
the most reliable way to fix the issue.
Per bug #5998 from Mark Reid. Back-patch to 8.3, which is the oldest
version with MVCC-safe CLUSTER.
Tom Lane [Fri, 29 Apr 2011 05:44:50 +0000 (01:44 -0400)]
Rewrite pg_size_pretty() to avoid compiler bug.
Convert it to use successive shifts right instead of increasing a divisor.
This is probably a tad more efficient than the original coding, and it's
nicer-looking than the previous patch because we don't need a special case
to avoid overflow in the last branch. But the real reason to do it is to
avoid a Solaris compiler bug, as per results from buildfarm member moa.
Andrew Dunstan [Thu, 28 Apr 2011 23:58:49 +0000 (19:58 -0400)]
Use non-literal format for possibly non-standard strftime formats.
Per recent -hackers discussion. The formats in question are %G and %V,
and cause warnings on MinGW at least. We assume the ecpg application
knows what it's doing if it passes these formats to the library.
Andrew Dunstan [Thu, 28 Apr 2011 14:56:14 +0000 (10:56 -0400)]
Use a macro variable PG_PRINTF_ATTRIBUTE for the style used for checking printf type functions.
The style is set to "printf" for backwards compatibility everywhere except
on Windows, where it is set to "gnu_printf", which eliminates hundreds of
false error messages from modern versions of gcc arising from %m and %ll{d,u}
formats.
Peter Eisentraut [Wed, 27 Apr 2011 19:08:22 +0000 (22:08 +0300)]
Fix binary upgrade of altered typed tables
Instead of dumping them as CREATE TABLE ... OF, dump them as normal
tables with the usual special processing for dropped columns, and then
attach them to the type afterward, using ALTER TABLE ... OF. This is
analogous to the existing handling of inherited tables.
Magnus Hagander [Wed, 27 Apr 2011 18:39:20 +0000 (20:39 +0200)]
timeline is not needed in BaseBackup()
This code was accidentally part of the patch, it's only
needed for the code that's for 9.2. Not needing the timeline
also removes the need to call IDENTIFY_SYSTEM.
Tom Lane [Wed, 27 Apr 2011 17:58:36 +0000 (13:58 -0400)]
Fix array- and path-creating functions to ensure padding bytes are zeroes.
Per recent discussion, it's important for all computed datums (not only the
results of input functions) to not contain any ill-defined (uninitialized)
bits. Failing to ensure that can result in equal() reporting that
semantically indistinguishable Consts are not equal, which in turn leads to
bizarre and undesirable planner behavior, such as in a recent example from
David Johnston. We might eventually try to fix this in a general manner by
allowing datatypes to define identity-testing functions, but for now the
path of least resistance is to expect datatypes to force all unused bits
into consistent states.
Per some testing by Noah Misch, array and path functions seem to be the
only ones presenting risks at the moment, so I looked through all the
functions in adt/array*.c and geo_ops.c and fixed them as necessary. In
the array functions, the easiest/safest fix is to allocate result arrays
with palloc0 instead of palloc. Possibly in future someone will want to
look into whether we can just zero the padding bytes, but that looks too
complex for a back-patchable fix. In the path functions, we already had a
precedent in path_in for just zeroing the one known pad field, so duplicate
that code as needed.
Tom Lane [Tue, 26 Apr 2011 19:56:28 +0000 (15:56 -0400)]
Rephrase some not-supported error messages in pg_hba.conf processing.
In a couple of places we said "not supported on this platform" for cases
that aren't really platform-specific, but could depend on configuration
options such as --with-openssl. Use "not supported by this build" instead,
as that doesn't convey the impression that you can't fix it without moving
to another OS; that's also more consistent with the wording used for an
identical error case in guc.c.
No back-patch, as the clarity gain is small enough to not be worth
burdening translators with back-branch changes.
Tom Lane [Tue, 26 Apr 2011 19:40:11 +0000 (15:40 -0400)]
Complain if pg_hba.conf contains "hostssl" but SSL is disabled.
Most commenters agreed that this is more friendly than silently failing
to match the line during actual connection attempts. Also, this will
prevent corner cases that might arise when trying to handle such a line
when the SSL code isn't turned on. An example is that specifying
clientcert=1 in such a line would formerly result in a completely
misleading complaint that root.crt wasn't present, as seen in a recent
report from Marc-Andre Laverdiere. While we could have instead fixed
that specific behavior, it seems likely that we'd have a continuing stream
of such bizarre behaviors if we keep on allowing hostssl lines when SSL is
disabled.
Back-patch to 8.4, where clientcert was introduced. Earlier versions don't
have this specific issue, and the code is enough different to make this
patch not applicable without more work than it seems worth.
Bruce Momjian [Tue, 26 Apr 2011 00:17:48 +0000 (20:17 -0400)]
In pg_upgrade, avoid one start/stop of the postmaster; use the -w
(wait) flag for pg_ctl start/stop; remove the unused "quiet" flag in
the functions for starting/stopping the postmaster.
Tom Lane [Tue, 26 Apr 2011 00:13:53 +0000 (20:13 -0400)]
Remove incorrect HINT for use of ALTER FOREIGN TABLE on the wrong relkind.
Per discussion, removing the hint seems better than correcting it because
the adjacent analogous cases in RenameRelation don't have any hints, and
nobody seems to have missed 'em.
Robert Haas [Mon, 25 Apr 2011 20:55:11 +0000 (16:55 -0400)]
Refactor broken CREATE TABLE IF NOT EXISTS support.
Per bug #5988, reported by Marko Tiikkaja, and further analyzed by Tom
Lane, the previous coding was broken in several respects: even if the
target table already existed, a subsequent CREATE TABLE IF NOT EXISTS
might try to add additional constraints or sequences-for-serial
specified in the new CREATE TABLE statement.
In passing, this also fixes a minor information leak: it's no longer
possible to figure out whether a schema to which you don't have CREATE
access contains a sequence named like "x_y_seq" by attempting to create a
table in that schema called "x" with a serial column called "y".
Some more refactoring of this code in the future might be warranted,
but that will need to wait for a later major release.
Robert Haas [Mon, 25 Apr 2011 20:34:57 +0000 (16:34 -0400)]
Remove partial and undocumented GRANT .. FOREIGN TABLE support.
Instead, foreign tables are treated just like views: permissions can
be granted using GRANT privilege ON [TABLE] foreign_table_name TO role,
and revoked similarly. GRANT/REVOKE .. FOREIGN TABLE is no longer
supported, just as we don't support GRANT/REVOKE .. VIEW. The set of
accepted permissions for foreign tables is now identical to the set for
regular tables, and views.
Per report from Thom Brown, and subsequent discussion.
Tom Lane [Mon, 25 Apr 2011 20:22:12 +0000 (16:22 -0400)]
Fix pg_size_pretty() to avoid overflow for inputs close to INT64_MAX.
The expression that tried to round the value to the nearest TB could
overflow, leading to bogus output as reported in bug #5993 from Nicola
Cossu. This isn't likely to ever happen in the intended usage of the
function (if it could, we'd be needing to use a wider datatype instead);
but it's not hard to give the expected output, so let's do so.
Peter Eisentraut [Mon, 25 Apr 2011 19:21:37 +0000 (22:21 +0300)]
Support "make check" in contrib
Added a new option --extra-install to pg_regress to arrange installing
the respective contrib directory into the temporary installation.
This is currently not yet supported for Windows MSVC builds.
Updated the .gitignore files for contrib modules to ignore the
leftovers of a temp-install check run.
Changed the exit status of "make check" in a pgxs build (which still
does nothing) to 0 from 1.
Added "make check" in contrib to top-level "make check-world".
Andrew Dunstan [Mon, 25 Apr 2011 16:46:59 +0000 (12:46 -0400)]
Prevent perl header overriding our *snprintf macros, and give it a usable PERL_UNUSED_DECL value.
This quiets compiler warnings about redefined macros and unused Perl__unused variables. The
redefinition of snprintf and vsnprintf is something we want to avoid anyway, if we've
gone to the bother of setting up the macros to point to our implementation.
Bruce Momjian [Mon, 25 Apr 2011 16:00:21 +0000 (12:00 -0400)]
Add postmaster/postgres undocumented -b option for binary upgrades.
This option turns off autovacuum, prevents non-super-user connections,
and enables oid setting hooks in the backend. The code continues to use
the old autoavacuum disable settings for servers with earlier catalog
versions.
This includes a catalog version bump to identify servers that support
the -b option.
Andrew Dunstan [Mon, 25 Apr 2011 13:10:59 +0000 (09:10 -0400)]
Adjust yywrap macro for non-reentrant scanners for MSVC.
The MSVC compiler complains if a macro is called with less arguments
than its definition provides for. flex generates a macro with one
argument for yywrap, but only supplies the argument for reentrant
scanners, so we remove the useless argument in the non-reentrant
case to silence the warning.
Peter Eisentraut [Sun, 24 Apr 2011 22:12:16 +0000 (01:12 +0300)]
Normalize whitespace in the arguments of <indexterm>
Strip leading and trailing whitespace and replace interior whitespace
by a single space. This avoids problems with the index generator
producing duplicate index entries for terms that differ only in
whitespace.
Commit dca30da3433c40b5f92f1704c496cda052decef9 actually fixed all the
indexterm elements that would cause this problem at the moment, but in
case it sneaks in again, we're set.
Tom Lane [Sun, 24 Apr 2011 20:55:20 +0000 (16:55 -0400)]
Improve cost estimation for aggregates and window functions.
The previous coding failed to account properly for the costs of evaluating
the input expressions of aggregates and window functions, as seen in a
recent gripe from Claudio Freire. (I said at the time that it wasn't
counting these costs at all; but on closer inspection, it was effectively
charging these costs once per output tuple. That is completely wrong for
aggregates, and not exactly right for window functions either.)
There was also a hard-wired assumption that aggregates and window functions
had procost 1.0, which is now fixed to respect the actual cataloged costs.
The costing of WindowAgg is still pretty bogus, since it doesn't try to
estimate the effects of spilling data to disk, but that seems like a
separate issue.
Tom Lane [Sat, 23 Apr 2011 23:33:04 +0000 (19:33 -0400)]
Improve findoidjoins to cover more cases.
Teach the program and script to deal with OID-array referencing columns,
which we now have several of. Also, modify the recommended usage process
to specify that the program should be run against the regression database
rather than template1. This lets it find numerous joins that cannot be
found in template1 because the relevant catalogs are entirely empty.
Together these changes add seventeen formerly-missed cases to the oidjoins
regression test.
Tom Lane [Sat, 23 Apr 2011 16:51:47 +0000 (12:51 -0400)]
Adjust comments about collate.linux.utf8 regression test.
This test should now work in any database with UTF8 encoding, regardless
of the database's default locale. The former restriction was really
"doesn't work if default locale is C", and that was because of not handling
mbstowcs/wcstombs correctly.
Tom Lane [Sat, 23 Apr 2011 16:35:41 +0000 (12:35 -0400)]
Fix char2wchar/wchar2char to support collations properly.
These functions should take a pg_locale_t, not a collation OID, and should
call mbstowcs_l/wcstombs_l where available. Where those functions are not
available, temporarily select the correct locale with uselocale().
This change removes the bogus assumption that all locales selectable in
a given database have the same wide-character conversion method; in
particular, the collate.linux.utf8 regression test now passes with
LC_CTYPE=C, so long as the database encoding is UTF8.
I decided to move the char2wchar/wchar2char functions out of mbutils.c and
into pg_locale.c, because they work on wchar_t not pg_wchar_t and thus
don't really belong with the mbutils.c functions. Keeping them where they
were would have required importing pg_locale_t into pg_wchar.h somehow,
which did not seem like a good plan.
Tom Lane [Sat, 23 Apr 2011 00:19:58 +0000 (20:19 -0400)]
Fix contrib/btree_gist to handle collations properly.
Make use of the collation attached to the index column, instead of
hard-wiring DEFAULT_COLLATION_OID. (Note: in theory this could require
reindexing btree_gist indexes on textual columns, but I rather doubt anyone
has one with a non-default declared collation as yet.)
Tom Lane [Sat, 23 Apr 2011 00:13:12 +0000 (20:13 -0400)]
Make GIN and GIST pass the index collation to all their support functions.
Experimentation with contrib/btree_gist shows that the majority of the GIST
support functions potentially need collation information. Safest policy
seems to be to pass it to all of them, instead of making assumptions about
which ones could possibly need it.
Tom Lane [Fri, 22 Apr 2011 22:22:38 +0000 (18:22 -0400)]
De-kludge contrib/btree_gin for collations.
Using DEFAULT_COLLATION_OID in the comparePartial functions was not only
a lame hack, but outright wrong, because the compare functions for
collation-aware types were already responding to the declared index
collation. So comparePartial would have the wrong expectation about
the index's sort order, possibly leading to missing matches for prefix
searches.
Peter Eisentraut [Fri, 22 Apr 2011 21:44:45 +0000 (00:44 +0300)]
Small update to emacs example configuration
Since both tarballs and git now result in a "postgresql" directory
rather than a "pgsql" directory, adjust the example configuration to
look for the former.
Tom Lane [Fri, 22 Apr 2011 21:43:18 +0000 (17:43 -0400)]
Make a code-cleanup pass over the collations patch.
This patch is almost entirely cosmetic --- mostly cleaning up a lot of
neglected comments, and fixing code layout problems in places where the
patch made lines too long and then pgindent did weird things with that.
I did find a bug-of-omission in equalTupleDescs().
Robert Haas [Thu, 21 Apr 2011 01:35:15 +0000 (21:35 -0400)]
Allow ALTER TABLE name {OF type | NOT OF}.
This syntax allows a standalone table to be made into a typed table,
or a typed table to be made standalone. This is possibly a mildly
useful feature in its own right, but the real motivation for this
change is that we need it to make pg_upgrade work with typed tables.
This doesn't actually fix that problem, but it's necessary
infrastructure.
Tom Lane [Thu, 21 Apr 2011 00:34:11 +0000 (20:34 -0400)]
Fix bugs in indexing of in-doubt HOT-updated tuples.
If we find a DELETE_IN_PROGRESS HOT-updated tuple, it is impossible to know
whether to index it or not except by waiting to see if the deleting
transaction commits. If it doesn't, the tuple might again be LIVE, meaning
we have to index it. So wait and recheck in that case.
Also, we must not rely on ii_BrokenHotChain to decide that it's possible to
omit tuples from the index. That could result in omitting tuples that we
need, particularly in view of yesterday's fixes to not necessarily set
indcheckxmin (but it's broken even without that, as per my analysis today).
Since this is just an extremely marginal performance optimization, dropping
the test shouldn't hurt.
These cases are only expected to happen in system catalogs (they're
possible there due to early release of RowExclusiveLock in most
catalog-update code paths). Since reindexing of a system catalog isn't a
particularly performance-critical operation anyway, there's no real need to
be concerned about possible performance degradation from these changes.
The worst aspects of this bug were introduced in 9.0 --- 8.x will always
wait out a DELETE_IN_PROGRESS tuple. But I think dropping index entries
on the strength of ii_BrokenHotChain is dangerous even without that, so
back-patch removal of that optimization to 8.3 and 8.4.
Tom Lane [Wed, 20 Apr 2011 23:01:20 +0000 (19:01 -0400)]
Set indcheckxmin true when REINDEX fixes an invalid or not-ready index.
Per comment from Greg Stark, it's less clear that HOT chains don't conflict
with the index than it would be for a valid index. So, let's preserve the
former behavior that indcheckxmin does get set when there are
potentially-broken HOT chains in this case. This change does not cause any
pg_index update that wouldn't have happened anyway, so we're not
re-introducing the previous bug with pg_index updates, and surely the case
is not significant from a performance standpoint; so let's be as
conservative as possible.
Tom Lane [Wed, 20 Apr 2011 21:43:34 +0000 (17:43 -0400)]
Make plan_cluster_use_sort cope with no IndexOptInfo for the target index.
The original coding assumed that such a case represents caller error, but
actually get_relation_info will omit generating an IndexOptInfo for any
index it thinks is unsafe to use. Therefore, handle this case by returning
"true" to indicate that a seqscan-and-sort is the preferred way to
implement the CLUSTER operation. New bug in 9.1, no backpatch needed.
Per bug #5985 from Daniel Grace.
Peter Eisentraut [Wed, 20 Apr 2011 20:19:04 +0000 (23:19 +0300)]
Fix PL/Python traceback for error in separate file
It assumed that the lineno from the traceback always refers to the
PL/Python function. If you created a PL/Python function that imports
some code, runs it, and that code raises an exception, PLy_traceback
would get utterly confused.
Now we look at the file name reported with the traceback and only
print the source line if it came from the PL/Python function.
Quotes in strings injected into bki file need to escaped. In particular,
"People's Republic of China" locale on Windows was causing initdb to fail.
This fixes bug #5818 reported by yulei. On master, this makes the mapping
of "People's Republic of China" to just "China" obsolete. In 9.0 and 8.4,
just fix the escaping. Earlier versions didn't have locale names in bki
file.
Bruce Momjian [Wed, 20 Apr 2011 01:00:29 +0000 (21:00 -0400)]
Throw error for mismatched pg_upgrade clusters
If someone removes the 'postgres' database from the old cluster and the
new cluster has a 'postgres' database, the number of databases will not
match. We actually could upgrade such a setup, but it would violate the
1-to-1 mapping of database counts, so we throw an error instead.
Previously they got an error during the upgrade, and not at the check
stage; PG 9.0.4 does the same.
Tom Lane [Tue, 19 Apr 2011 22:50:56 +0000 (18:50 -0400)]
Avoid changing an index's indcheckxmin horizon during REINDEX.
There can never be a need to push the indcheckxmin horizon forward, since
any HOT chains that are actually broken with respect to the index must
pre-date its original creation. So we can just avoid changing pg_index
altogether during a REINDEX operation.
This offers a cleaner solution than my previous patch for the problem
found a few days ago that we mustn't try to update pg_index while we are
reindexing it. System catalog indexes will always be created with
indcheckxmin = false during initdb, and with this modified code we should
never try to change their pg_index entries. This avoids special-casing
system catalogs as the former patch did, and should provide a performance
benefit for many cases where REINDEX formerly caused an index to be
considered unusable for a short time.
Back-patch to 8.3 to cover all versions containing HOT. Note that this
patch changes the API for index_build(), but I believe it is unlikely that
any add-on code is calling that directly.
Peter Eisentraut [Tue, 19 Apr 2011 19:52:52 +0000 (22:52 +0300)]
Refix the unaccent regression test on MSVC properly
... for some value of "properly". Instead of overriding REGRESS_OPTS,
set the variables ENCODING and NO_LOCALE, which is more expressive and
allows overriding by the user. Fix vcregress.pl to handle that.
Tom Lane [Tue, 19 Apr 2011 16:17:13 +0000 (12:17 -0400)]
Refrain from canonicalizing a client_encoding setting of "UNICODE".
While "UTF8" is the correct name for this encoding, existing JDBC drivers
expect that if they send "UNICODE" it will read back the same way; they
fail with an opaque "Protocol error" complaint if not. This will be fixed
in the 9.1 drivers, but until older drivers are no longer in use in the
wild, we'd better leave "UNICODE" alone. Continue to canonicalize all
other inputs. Per report from Steve Singer and subsequent discussion.
Tom Lane [Mon, 18 Apr 2011 19:31:52 +0000 (15:31 -0400)]
Fix handling of collations in multi-row VALUES constructs.
Per spec we ought to apply select_common_collation() across the expressions
in each column of the VALUES table. The original coding was just taking
the first row and assuming it was representative.
This patch adds a field to struct RangeTblEntry to carry the resolved
collations, so initdb is forced for changes in stored rule representation.
Robert Haas [Mon, 18 Apr 2011 14:13:34 +0000 (10:13 -0400)]
Only allow typed tables to hang off composite types, not e.g. tables.
This also ensures that we take a relation lock on the composite type when
creating a typed table, which is necessary to prevent the composite type
and the typed table from getting out of step in the face of concurrent
DDL.
Robert Haas [Mon, 18 Apr 2011 12:27:19 +0000 (08:27 -0400)]
recoveryStopsHere() must check the resource manager ID.
Before commit c016ce728139be95bb0dc7c4e5640507334c2339, this wasn't
needed, but now that multiple resource manager IDs can percolate down
through here, we have to make sure we know which one we've got.
Otherwise, we can confuse (for example) an XLOG_XACT_COMMIT record
with an XLOG_CHECKPOINT_SHUTDOWN record.
Tom Lane [Sun, 17 Apr 2011 22:09:22 +0000 (18:09 -0400)]
Fix assorted infelicities in collation handling in psql's describe.c.
In \d, be more careful to print collation only if it's not the default for
the column's data type. Avoid assuming that the name "default" is magic.
Fix \d on a composite type so that it will print per-column collations.
It's no longer the case that a composite type cannot have modifiers.
(In consequence, the expected outputs for composite-type regression tests
change.)
Fix \dD so that it will print collation for a domain, again only if it's
not the same as the base type's collation.
Tom Lane [Sun, 17 Apr 2011 18:54:19 +0000 (14:54 -0400)]
Support a COLLATE clause in plpgsql variable declarations.
This allows the usual rules for assigning a collation to a local variable
to be overridden. Per discussion, it seems appropriate to support this
rather than forcing all local variables to have the argument-derived
collation.
Tom Lane [Sun, 17 Apr 2011 17:36:38 +0000 (13:36 -0400)]
foreach() and list_delete() don't mix.
Fix crash when releasing duplicate entries in the encoding conversion cache
list, caused by releasing the current entry of the list being chased by
foreach(). We have a standard idiom for handling such cases, but this
loop wasn't using it.
This got broken in my recent rewrite of GUC assign hooks. Not sure how
I missed this when testing the modified code, but I did. Per report from
Peter.
Tom Lane [Sat, 16 Apr 2011 21:26:41 +0000 (17:26 -0400)]
Simplify reindex_relation's API.
For what seem entirely historical reasons, a bitmask "flags" argument was
recently added to reindex_relation without subsuming its existing boolean
argument into that bitmask. This seems a bit bizarre, so fold them
together.
Tom Lane [Sat, 16 Apr 2011 20:39:50 +0000 (16:39 -0400)]
Clean up collation processing in prepunion.c.
This area was a few bricks shy of a load, and badly under-commented too.
We have to ensure that the generated targetlist entries for a set-operation
node expose the correct collation for each entry, since higher-level
processing expects the tlist to reflect the true ordering of the plan's
output.
This hackery wouldn't be necessary if SortGroupClause carried collation
info ... but making it do so would inject more pain in the parser than
would be saved here. Still, we might want to rethink that sometime.
Tom Lane [Sat, 16 Apr 2011 00:18:57 +0000 (20:18 -0400)]
Prevent incorrect updates of pg_index while reindexing pg_index itself.
The places that attempt to change pg_index.indcheckxmin during a reindexing
operation cannot be executed safely if pg_index itself is the subject of
the operation. This is the explanation for a couple of recent reports of
VACUUM FULL failing with
ERROR: duplicate key value violates unique constraint "pg_index_indexrelid_index"
DETAIL: Key (indexrelid)=(2678) already exists.
However, there isn't any real need to update indcheckxmin in such a
situation, if we assume that pg_index can never contain a truly broken HOT
chain. This assumption holds if new indexes are never created on it during
concurrent operations, which is something we don't consider safe for any
system catalog, not just pg_index. Accordingly, modify the code to not
manipulate indcheckxmin when reindexing any system catalog.
Back-patch to 8.3, where HOT was introduced. The known failure scenarios
involve 9.0-style VACUUM FULL, so there might not be any real risk before
9.0, but let's not assume that.