Tom Lane [Tue, 19 Apr 2011 22:50:56 +0000 (18:50 -0400)]
Avoid changing an index's indcheckxmin horizon during REINDEX.
There can never be a need to push the indcheckxmin horizon forward, since
any HOT chains that are actually broken with respect to the index must
pre-date its original creation. So we can just avoid changing pg_index
altogether during a REINDEX operation.
This offers a cleaner solution than my previous patch for the problem
found a few days ago that we mustn't try to update pg_index while we are
reindexing it. System catalog indexes will always be created with
indcheckxmin = false during initdb, and with this modified code we should
never try to change their pg_index entries. This avoids special-casing
system catalogs as the former patch did, and should provide a performance
benefit for many cases where REINDEX formerly caused an index to be
considered unusable for a short time.
Back-patch to 8.3 to cover all versions containing HOT. Note that this
patch changes the API for index_build(), but I believe it is unlikely that
any add-on code is calling that directly.
Peter Eisentraut [Tue, 19 Apr 2011 19:52:52 +0000 (22:52 +0300)]
Refix the unaccent regression test on MSVC properly
... for some value of "properly". Instead of overriding REGRESS_OPTS,
set the variables ENCODING and NO_LOCALE, which is more expressive and
allows overriding by the user. Fix vcregress.pl to handle that.
Tom Lane [Tue, 19 Apr 2011 16:17:13 +0000 (12:17 -0400)]
Refrain from canonicalizing a client_encoding setting of "UNICODE".
While "UTF8" is the correct name for this encoding, existing JDBC drivers
expect that if they send "UNICODE" it will read back the same way; they
fail with an opaque "Protocol error" complaint if not. This will be fixed
in the 9.1 drivers, but until older drivers are no longer in use in the
wild, we'd better leave "UNICODE" alone. Continue to canonicalize all
other inputs. Per report from Steve Singer and subsequent discussion.
Tom Lane [Mon, 18 Apr 2011 19:31:52 +0000 (15:31 -0400)]
Fix handling of collations in multi-row VALUES constructs.
Per spec we ought to apply select_common_collation() across the expressions
in each column of the VALUES table. The original coding was just taking
the first row and assuming it was representative.
This patch adds a field to struct RangeTblEntry to carry the resolved
collations, so initdb is forced for changes in stored rule representation.
Robert Haas [Mon, 18 Apr 2011 14:13:34 +0000 (10:13 -0400)]
Only allow typed tables to hang off composite types, not e.g. tables.
This also ensures that we take a relation lock on the composite type when
creating a typed table, which is necessary to prevent the composite type
and the typed table from getting out of step in the face of concurrent
DDL.
Robert Haas [Mon, 18 Apr 2011 12:27:19 +0000 (08:27 -0400)]
recoveryStopsHere() must check the resource manager ID.
Before commit c016ce728139be95bb0dc7c4e5640507334c2339, this wasn't
needed, but now that multiple resource manager IDs can percolate down
through here, we have to make sure we know which one we've got.
Otherwise, we can confuse (for example) an XLOG_XACT_COMMIT record
with an XLOG_CHECKPOINT_SHUTDOWN record.
Tom Lane [Sun, 17 Apr 2011 22:09:22 +0000 (18:09 -0400)]
Fix assorted infelicities in collation handling in psql's describe.c.
In \d, be more careful to print collation only if it's not the default for
the column's data type. Avoid assuming that the name "default" is magic.
Fix \d on a composite type so that it will print per-column collations.
It's no longer the case that a composite type cannot have modifiers.
(In consequence, the expected outputs for composite-type regression tests
change.)
Fix \dD so that it will print collation for a domain, again only if it's
not the same as the base type's collation.
Tom Lane [Sun, 17 Apr 2011 18:54:19 +0000 (14:54 -0400)]
Support a COLLATE clause in plpgsql variable declarations.
This allows the usual rules for assigning a collation to a local variable
to be overridden. Per discussion, it seems appropriate to support this
rather than forcing all local variables to have the argument-derived
collation.
Tom Lane [Sun, 17 Apr 2011 17:36:38 +0000 (13:36 -0400)]
foreach() and list_delete() don't mix.
Fix crash when releasing duplicate entries in the encoding conversion cache
list, caused by releasing the current entry of the list being chased by
foreach(). We have a standard idiom for handling such cases, but this
loop wasn't using it.
This got broken in my recent rewrite of GUC assign hooks. Not sure how
I missed this when testing the modified code, but I did. Per report from
Peter.
Tom Lane [Sat, 16 Apr 2011 21:26:41 +0000 (17:26 -0400)]
Simplify reindex_relation's API.
For what seem entirely historical reasons, a bitmask "flags" argument was
recently added to reindex_relation without subsuming its existing boolean
argument into that bitmask. This seems a bit bizarre, so fold them
together.
Tom Lane [Sat, 16 Apr 2011 20:39:50 +0000 (16:39 -0400)]
Clean up collation processing in prepunion.c.
This area was a few bricks shy of a load, and badly under-commented too.
We have to ensure that the generated targetlist entries for a set-operation
node expose the correct collation for each entry, since higher-level
processing expects the tlist to reflect the true ordering of the plan's
output.
This hackery wouldn't be necessary if SortGroupClause carried collation
info ... but making it do so would inject more pain in the parser than
would be saved here. Still, we might want to rethink that sometime.
Tom Lane [Sat, 16 Apr 2011 00:18:57 +0000 (20:18 -0400)]
Prevent incorrect updates of pg_index while reindexing pg_index itself.
The places that attempt to change pg_index.indcheckxmin during a reindexing
operation cannot be executed safely if pg_index itself is the subject of
the operation. This is the explanation for a couple of recent reports of
VACUUM FULL failing with
ERROR: duplicate key value violates unique constraint "pg_index_indexrelid_index"
DETAIL: Key (indexrelid)=(2678) already exists.
However, there isn't any real need to update indcheckxmin in such a
situation, if we assume that pg_index can never contain a truly broken HOT
chain. This assumption holds if new indexes are never created on it during
concurrent operations, which is something we don't consider safe for any
system catalog, not just pg_index. Accordingly, modify the code to not
manipulate indcheckxmin when reindexing any system catalog.
Back-patch to 8.3, where HOT was introduced. The known failure scenarios
involve 9.0-style VACUUM FULL, so there might not be any real risk before
9.0, but let's not assume that.
Tom Lane [Fri, 15 Apr 2011 21:45:41 +0000 (17:45 -0400)]
Guard against incoming rowcount estimate of NaN in cost_mergejoin().
Although rowcount estimates really ought not be NaN, a bug elsewhere
could perhaps result in that, and that would cause Assert failure in
cost_mergejoin, which I believe to be the explanation for bug #5977 from
Anton Kuznetsov. Seems like a good idea to expend a couple more cycles
to prevent that, even though the real bug is elsewhere. Not back-patching,
though, because we don't encourage running production systems with
Asserts on.
setlocale() on Windows doesn't work correctly if the locale name contains
apostrophes or dots. There isn't much hope of Microsoft fixing it any time
soon, it's been like that for ages, so we better work around it. So, map a
few common Windows locale names known to cause problems to aliases that work.
On Windows, if the encoding implied by locale is not allowed as a
server-encoding, fall back to UTF-8. It happens at least with the Chinese
locale, which implies BIG5. This is safe, because on Windows all locales
are compatible with UTF-8.
Reduce the initial size of local lock hash to 16 entries.
The hash table is seq scanned at transaction end, to release all locks,
and making the hash table larger than necessary makes that slower. With
very simple queries, that overhead can amount to a few percent of the total
CPU time used.
At the moment, backend startup needs 6 locks, and a simple query with one
table and index needs 3 locks. 16 is enough for even quite complicated
transactions, and it will grow automatically if it fills up.
Robert Haas [Thu, 14 Apr 2011 02:20:39 +0000 (19:20 -0700)]
Remove obsolete comment.
The lock level for adding a parent table is now ShareUpdateExclusiveLock;
see commit fbcf4b92aa64d4577bcf25925b055316b978744a. This comment didn't
get updated to match, but it doesn't seem important to mention this detail
here, so rather than updating it now, just take it out.
Robert Haas [Thu, 14 Apr 2011 01:07:14 +0000 (18:07 -0700)]
Fix toast table creation.
Instead of using slightly-too-clever heuristics to decide when we must
create a TOAST table, just check whether one is needed every time the
table is altered. Checking whether a toast table is needed is cheap
enough that we needn't worry about doing it on every ALTER TABLE command,
and the previous coding is apparently prone to accidental breakage:
commit 04e17bae50a73af524731fa11210d5c3f7d8e1f9 broken ALTER TABLE ..
SET STORAGE, which moved some actions from AT_PASS_COL_ATTRS to
AT_PASS_MISC, and commit 6c5723998594dffa5d47c3cf8c96ccf89c033aae broke
ALTER TABLE .. ADD COLUMN by changing the way that adding columns
recurses into child tables.
Tom Lane [Wed, 13 Apr 2011 22:56:40 +0000 (18:56 -0400)]
Ensure mark_dummy_rel doesn't create dangling pointers in RelOptInfos.
When we are doing GEQO join planning, the current memory context is a
short-lived context that will be reset at the end of geqo_eval(). However,
the RelOptInfos for base relations are set up before that and then re-used
across many GEQO cycles. Hence, any code that modifies a baserel during
join planning has to be careful not to put pointers to the short-lived
context into the baserel struct. mark_dummy_rel got this wrong, leading to
easy-to-reproduce-once-you-know-how crashes in 8.4, as reported off-list by
Leo Carson of SDSC. Some improvements made in 9.0 make it difficult to
demonstrate the crash in 9.0 or HEAD; but there's no doubt that there's
still a risk factor here, so patch all branches that have the function.
(Note: 8.3 has a similar function, but it's only applied to joinrels and
thus is not a hazard.)
On HP/UX, the structs used by ioctl(SIOCGLIFCONF) are named differently
than on other platforms, and only IPv6 addresses are returned. Because of
those two issues, fall back to ioctl(SIOCGIFCONF) on HP/UX, so that it at
least compiles and finds IPv4 addresses. This function is currently only
used for interpreting samehost/samenet in pg_hba.conf, which isn't that
critical.
Revert the patch to check if we've reached end-of-backup also when doing
crash recovery, and throw an error if not. hubert depesz lubaczewski pointed
out that that situation also happens in the crash recovery following a
system crash that happens during an online backup.
We might want to do something smarter in 9.1, like put the check back for
backups taken with pg_basebackup, but that's for another patch.
On IA64 architecture, we check the depth of the register stack in addition
to the regular stack. The code to do that is platform and compiler specific,
add support for the HP-UX native compiler.
Tom Lane [Tue, 12 Apr 2011 23:19:24 +0000 (19:19 -0400)]
Pass collations to functions in FunctionCallInfoData, not FmgrInfo.
Since collation is effectively an argument, not a property of the function,
FmgrInfo is really the wrong place for it; and this becomes critical in
cases where a cached FmgrInfo is used for varying purposes that might need
different collation settings. Fix by passing it in FunctionCallInfoData
instead. In particular this allows a clean fix for bug #5970 (record_cmp
not working). This requires touching a bit more code than the original
method, but nobody ever thought that collations would not be an invasive
patch...
Tom Lane [Tue, 12 Apr 2011 06:05:24 +0000 (02:05 -0400)]
Suppress compiler warnings about "value computed is not used".
The recent patch to remove gcc 4.6 warnings created some new ones, at
least on my rather old gcc version. Try to make everybody happy by
casting to "void" when we just want to discard the result.
Tom Lane [Tue, 12 Apr 2011 05:59:34 +0000 (01:59 -0400)]
Be more wary of missing statistics in eqjoinsel_semi().
In particular, if we don't have real ndistinct estimates for both sides,
fall back to assuming that half of the left-hand rows have join partners.
This is what was done in 8.2 and 8.3 (cf nulltestsel() in those versions).
It's pretty stupid but it won't lead us to think that an antijoin produces
no rows out, as seen in recent example from Uwe Schroeder.
Tom Lane [Tue, 12 Apr 2011 01:32:53 +0000 (21:32 -0400)]
Fix RI_Initial_Check to use a COLLATE clause when needed in its query.
If the referencing and referenced columns have different collations,
the parser will be unable to resolve which collation to use unless it's
helped out in this way. The effects are sometimes masked, if we end up
using a non-collation-sensitive plan; but if we do use a mergejoin
we'll see a failure, as recently noted by Robert Haas.
The SQL spec states that the referenced column's collation should be used
to resolve RI checks, so that's what we do. Note however that we currently
don't append a COLLATE clause when writing a query that examines only the
referencing column. If we ever support collations that have varying
notions of equality, that will have to be changed. For the moment, though,
it's preferable to leave it off so that we can use a normal index on the
referencing column.
Peter Eisentraut [Mon, 11 Apr 2011 19:28:45 +0000 (22:28 +0300)]
Clean up most -Wunused-but-set-variable warnings from gcc 4.6
This warning is new in gcc 4.6 and part of -Wall. This patch cleans
up most of the noise, but there are some still warnings that are
trickier to remove.
Tom Lane [Mon, 11 Apr 2011 16:28:28 +0000 (12:28 -0400)]
Teach pattern_fixed_prefix() about collations.
This is necessary, not optional, now that ILIKE and regexes are collation
aware --- else we might derive a wrong comparison constant for index
optimized pattern matches.
TransferPredicateLocksToNewTarget should initialize a new lock
entry's commitSeqNo to that of the old one being transferred, or take
the minimum commitSeqNo if it is merging two lock entries.
Also, CreatePredicateLock should initialize commitSeqNo for to
InvalidSerCommitSeqNo instead of to 0. (I don't think using 0 would
actually affect anything, but we should be consistent.)
I also added a couple of assertions I used to track this down: a
lock's commitSeqNo should never be zero, and it should be
InvalidSerCommitSeqNo if and only if the lock is not held by
OldCommittedSxact.
Dan Ports, to fix leak of predicate locks reported by YAMAMOTO Takashi.
Tom Lane [Sun, 10 Apr 2011 22:02:17 +0000 (18:02 -0400)]
Teach regular expression operators to honor collations.
This involves getting the character classification and case-folding
functions in the regex library to use the collations infrastructure.
Most of this work had been done already in connection with the upper/lower
and LIKE logic, so it was a simple matter of transposition.
While at it, split out these functions into a separate source file
regc_pg_locale.c, so that they can be correctly labeled with the Postgres
project's license rather than the Scriptics license. These functions are
100% Postgres-written code whereas what remains in regc_locale.c is still
mostly not ours, so lumping them both under the same copyright notice was
getting more and more misleading.
Tom Lane [Sat, 9 Apr 2011 21:12:39 +0000 (17:12 -0400)]
Fix ILIKE to honor collation when working in single-byte encodings.
The original collation patch only fixed the multi-byte code path.
This change also ensures that ILIKE's idea of the case-folding rules
is exactly the same as str_tolower's.
Tom Lane [Sat, 9 Apr 2011 20:24:36 +0000 (16:24 -0400)]
Remove collate.linux.utf8.sql's assumptions about ".utf8" in locale names.
Tweak the test so that it does not depend on the platform using ".utf8" as
the extension signifying that a locale uses UTF8 encoding. For the most
part this just requires using the abbreviated collation names "en_US" etc,
though I had to work a bit harder on the collation creation tests.
This opens the door to using the test on platforms that spell locales
differently, for example ".utf-8" or ".UTF-8". Also, the test is now
somewhat useful with server encodings other than UTF8; though depending on
which encoding is selected, different subsets of it will fail for lack of
character set support.
Tom Lane [Sat, 9 Apr 2011 18:40:09 +0000 (14:40 -0400)]
Adjust collation determination rules as per discussion.
Remove crude hack that tried to propagate collation through a
function-returning-record, ie, from the function's arguments to individual
fields selected from its result record. That is just plain inconsistent,
because the function result is composite and cannot have a collation;
and there's no hope of making this kind of action-at-a-distance work
consistently. Adjust regression test cases that expected this to happen.
Meanwhile, the behavior of casting to a domain with a declared collation
stays the same as it was, since that seemed to be the consensus.
Tom Lane [Sat, 9 Apr 2011 18:08:41 +0000 (14:08 -0400)]
Don't show unusable collations in psql's \dO command.
"Unusable" collations are those not matching the current database's
encoding. The former behavior inconsistently showed such collations
some of the time, depending on the details of the pattern argument.
Tom Lane [Fri, 8 Apr 2011 23:19:17 +0000 (19:19 -0400)]
Clean up minor collation issues in indxpath.c.
Get rid of bogus collation test in match_special_index_operator (even for
ILIKE, the pattern match operator's collation doesn't matter here, and even
if it did the test was testing the wrong thing).
Fix broken looping logic in expand_indexqual_rowcompare.
Add collation check in match_clause_to_ordering_op.
Make naming and argument ordering more consistent; improve comments.
Tom Lane [Fri, 8 Apr 2011 21:39:59 +0000 (17:39 -0400)]
Fix make_greater_string to not have an undocumented collation assumption.
The previous coding worked only if ltproc->fn_collation was always either
DEFAULT_COLLATION_OID or a C-compatible locale. While that's true at the
moment, it wasn't documented (and in fact wasn't true when this code was
committed...). But it only takes a couple more lines to make its internal
caching behavior locale-aware, so let's do that.
Robert Haas [Fri, 8 Apr 2011 20:51:45 +0000 (16:51 -0400)]
Truncate the predicate lock SLRU to empty, instead of almost empty.
Otherwise, the SLRU machinery can get confused and think that the SLRU
has wrapped around. Along the way, regardless of whether we're
truncating all of the SLRU or just some of it, flush pages after
truncating, rather than before.
Tom Lane [Fri, 8 Apr 2011 20:48:25 +0000 (16:48 -0400)]
Tweak collation setup for GIN index comparison functions.
Honor index column's collation spec if there is one, don't go to the
expense of calling get_typcollation when we can reasonably assume that
all GIN storage types will use default collation, and be sure to set
a collation for the comparePartialFn too.
Tom Lane [Fri, 8 Apr 2011 20:11:04 +0000 (16:11 -0400)]
Avoid an unnecessary syscache lookup in parse_coerce.c.
All the other fields of the constant are being extracted from the syscache
entry we already have, so handle collation similarly. (There don't seem
to be any other uses for the new function at the moment.)
Tom Lane [Fri, 8 Apr 2011 19:38:57 +0000 (15:38 -0400)]
Modify initdb to complain only when no usable system locales are found.
Per discussion, the original behavior seems too noisy. But if things
are so broken that none of the locales reported by "locale -a" are usable,
that's probably worth warning about.
Robert Haas [Fri, 8 Apr 2011 19:29:02 +0000 (15:29 -0400)]
Partially roll back overenthusiastic SSI optimization.
When a regular lock is held, SSI can use that in lieu of a predicate lock
to detect rw conflicts; but if the regular lock is being taken by a
subtransaction, we can't assume that it'll commit, so releasing the
parent transaction's lock in that case is a no-no.
Tom Lane [Fri, 8 Apr 2011 15:36:05 +0000 (11:36 -0400)]
Avoid extra whitespace in the arguments of <indexterm>.
As noted by Thom Brown, this confuses the DocBook index processor; it
fails to merge entries that differ only in whitespace, and sorts them
unexpectedly as well. Seems like a toolchain bug, but I'm not going to
hold my breath waiting for a fix.
Note: easiest way to find these is to look for double spaces in HTML.index.
Tom Lane [Fri, 8 Apr 2011 14:54:03 +0000 (10:54 -0400)]
Add an example of WITH (UPDATE RETURNING) INSERT to the INSERT ref page.
Per a discussion with Gavin Flower. This barely scratches the surface
of potential WITH (something RETURNING) use cases, of course, but it's
one of the simplest compelling examples I can think of.
Robert Haas [Thu, 7 Apr 2011 20:43:39 +0000 (16:43 -0400)]
Tweaks for SSI out-of-shared memory behavior.
If we call hash_search() with HASH_ENTER, it will bail out rather than
return NULL, so it's redundant to check for NULL again in the caller.
Thus, in cases where we believe it's impossible for the hash table to run
out of slots anyway, we can simplify the code slightly.
On the flip side, in cases where it's theoretically possible to run out of
space, we don't want to rely on dynahash.c to throw an error; instead,
we pass HASH_ENTER_NULL and throw the error ourselves if a NULL comes
back, so that we can provide a more descriptive error message.
Tom Lane [Thu, 7 Apr 2011 19:14:39 +0000 (15:14 -0400)]
Modernize dlopen interface code for FreeBSD and OpenBSD.
Remove the hard-wired assumption that __mips__ (and only __mips__) lacks
dlopen in FreeBSD and OpenBSD. This assumption is outdated at least for
OpenBSD, as per report from an anonymous 9.1 tester. We can perfectly well
use HAVE_DLOPEN instead to decide which code to use.
Some other cosmetic adjustments to make freebsd.c, netbsd.c, and openbsd.c
exactly alike.
Tom Lane [Thu, 7 Apr 2011 15:40:23 +0000 (11:40 -0400)]
Fix SortTocFromFile() to cope with lines that are too long for its buffer.
The original coding supposed that a dump TOC file could never contain lines
longer than 1K. The folly of that was exposed by a recent report from
Per-Olov Esgard. We only really need to see the first dozen or two bytes
of each line, since we're just trying to read off the numeric ID at the
start of the line; so there's no need for a particularly huge buffer.
What there is a need for is logic to not process continuation bufferloads.
Back-patch to all supported branches, since it's always been like this.
Bruce Momjian [Thu, 7 Apr 2011 13:57:09 +0000 (09:57 -0400)]
Preserve pg_largeobject_metadata.relfrozenxid in pg_upgrade.
This is needed only in 9.1 because only 9.0 had this and no one is
upgrading from a 9.0 beta to 9.0 anymore. We basically don't backpatch
9.0 beta fixes at this point.
Tom Lane [Thu, 7 Apr 2011 06:34:57 +0000 (02:34 -0400)]
Fix collations when we call transformWhereClause from outside the parser.
Previous patches took care of assorted places that call transformExpr from
outside the main parser, but I overlooked the fact that some places use
transformWhereClause as a shortcut for transformExpr + coerce_to_boolean.
In particular this broke collation-sensitive index WHERE clauses, as per
report from Thom Brown. Trigger WHEN and rule WHERE clauses too.
I'm not forcing initdb for this fix, but any affected indexes, triggers,
or rules will need to be dropped and recreated.
Tom Lane [Thu, 7 Apr 2011 04:11:01 +0000 (00:11 -0400)]
Revise the API for GUC variable assign hooks.
The previous functions of assign hooks are now split between check hooks
and assign hooks, where the former can fail but the latter shouldn't.
Aside from being conceptually clearer, this approach exposes the
"canonicalized" form of the variable value to guc.c without having to do
an actual assignment. And that lets us fix the problem recently noted by
Bernd Helmle that the auto-tune patch for wal_buffers resulted in bogus
log messages about "parameter "wal_buffers" cannot be changed without
restarting the server". There may be some speed advantage too, because
this design lets hook functions avoid re-parsing variable values when
restoring a previous state after a rollback (they can store a pre-parsed
representation of the value instead). This patch also resolves a
longstanding annoyance about custom error messages from variable assign
hooks: they should modify, not appear separately from, guc.c's own message
about "invalid parameter value".
Robert Haas [Tue, 5 Apr 2011 19:16:59 +0000 (15:16 -0400)]
Repair some flakiness in CheckTargetForConflictsIn.
When we release and reacquire SerializableXactHashLock, we must recheck
whether an R/W conflict still needs to be flagged, because it could have
changed under us in the meantime. And when we release the partition
lock, we must re-walk the list of predicate locks from the beginning,
because our pointer could get invalidated under us.
Bug report #5952 by Yamamoto Takashi. Patch by Kevin Grittner.
Simon Riggs [Mon, 4 Apr 2011 22:23:13 +0000 (23:23 +0100)]
Avoid assuming there will be only 3 states for synchronous_commit.
Also avoid hardcoding the current default state by giving it the name
"on" and replace with a meaningful name that reflects its behaviour.
Coding only, no change in behaviour.