Tom Lane [Sun, 7 Jun 2009 20:09:34 +0000 (20:09 +0000)]
Revert my patch of 2009-04-04 that removed contrib/intarray's definitions of
the <@ and @> operators. These are not in fact equivalent to the built-in
anyarray operators of the same names, because they have different behavior for
empty arrays, namely they don't think empty arrays are contained in anything.
That is mathematically wrong, no doubt, but until we can persuade GIN indexes
to implement the mathematical definition we should probably not change this.
Another reason for not changing it now is that we can't yet ensure the
opclasses will be updated correctly in a dump-and-reload upgrade. Per
recent discussions.
Tom Lane [Sat, 6 Jun 2009 22:13:52 +0000 (22:13 +0000)]
Improve the IndexVacuumInfo/IndexBulkDeleteResult API to allow somewhat sane
behavior in cases where we don't know the heap tuple count accurately; in
particular partial vacuum, but this also makes the API a bit more useful
for ANALYZE. This patch adds "estimated_count" flags to both structs so
that an approximate count can be flagged as such, and adjusts the logic
so that approximate counts are not used for updating pg_class.reltuples.
This fixes my previous complaint that VACUUM was putting ridiculous values
into pg_class.reltuples for indexes. The actual impact of that bug is
limited, because the planner only pays attention to reltuples for an index
if the index is partial; which probably explains why beta testers hadn't
noticed a degradation in plan quality from it. But it needs to be fixed.
The whole thing is a bit messy and should be redesigned in future, because
reltuples now has the potential to drift quite far away from reality when
a long period elapses with no non-partial vacuums. But this is as good as
it's going to get for 8.4.
Joe Conway [Sat, 6 Jun 2009 21:27:56 +0000 (21:27 +0000)]
Add support for using SQL/MED compliant FOREIGN DATA WRAPPER, SERVER,
and USER MAPPING as method to supply dblink connect parameters. Per
mailing list and PGCon discussions.
Tom Lane [Sat, 6 Jun 2009 02:39:40 +0000 (02:39 +0000)]
Fix a serious bug introduced into GIN in 8.4: now that MergeItemPointers()
is supposed to remove duplicate heap TIDs, we have to be sure to reduce the
tuple size and posting-item count accordingly in addItemPointersToTuple().
Failing to do so resulted in the effective injection of garbage TIDs into the
index contents, ie, whatever happened to be in the memory palloc'd for the
new tuple. I'm not sure that this fully explains the index corruption
reported by Tatsuo Ishii, but the test case I'm using no longer fails.
Tom Lane [Fri, 5 Jun 2009 18:50:47 +0000 (18:50 +0000)]
GIN's ItemPointerIsMin, ItemPointerIsMax, and ItemPointerIsLossyPage macros
should use GinItemPointerGetBlockNumber/GinItemPointerGetOffsetNumber,
not ItemPointerGetBlockNumber/ItemPointerGetOffsetNumber, because the latter
will Assert() on ip_posid == 0, ie a "Min" pointer. (Thus, ItemPointerIsMin
has never worked at all, but it seems unused at present.) I'm not certain
that the case can occur in normal functioning, but it's blowing up on me
while investigating Tatsuo-san's data corruption problem. In any case it
seems like a problem waiting to bite someone.
Back-patch just in case this really is a problem for somebody in the field.
Tom Lane [Thu, 4 Jun 2009 19:16:48 +0000 (19:16 +0000)]
Remove a couple of debugging messages that have been #ifdef'd out for ages.
Seems silly to ask translators to expend work on these, especially in
pluralized variants.
Tom Lane [Thu, 4 Jun 2009 18:33:08 +0000 (18:33 +0000)]
Improve the recently-added support for properly pluralized error messages
by extending the ereport() API to cater for pluralization directly. This
is better than the original method of calling ngettext outside the elog.c
code because (1) it avoids double translation, which wastes cycles and in
the worst case could give a wrong result; and (2) it avoids having to use
a different coding method in PL code than in the core backend. The
client-side uses of ngettext are not touched since neither of these concerns
is very pressing in the client environment. Per my proposal of yesterday.
Tom Lane [Wed, 3 Jun 2009 20:24:51 +0000 (20:24 +0000)]
Clean up ecpg's use of mmerror(): const-ify the format argument, add an
__attribute__() marker so that gcc can validate the format string against
the actual arguments, get rid of overcomplicated and unsafe usage in
base_yyerror().
Tom Lane [Wed, 3 Jun 2009 16:17:49 +0000 (16:17 +0000)]
Change rather bizarre code ordering in get_id(). This isn't strictly
cosmetic --- I'm wondering if geteuid could have side effects on errno,
thus possibly resulting in a misleading error message after failure of
getpwuid.
Tom Lane [Tue, 2 Jun 2009 17:37:55 +0000 (17:37 +0000)]
Remove the old advice to keep from_collapse_limit less than geqo_threshold,
instead just pointing out that a larger value may trigger use of GEQO.
Per Robert Haas.
In passing, do a bit of wordsmithing on the Genetic Query Optimizer section.
Only recycle normal files in pg_xlog as WAL segments. pg_standby creates
symbolic links with the -l option, and as Fujii Masao pointed out we ended up
overwriting files in the archive directory before this patch. Patch by
Aidan Van Dyk, Fujii Masao and me.
Backpatch to 8.3, where pg_standby was introduced.
Joe Conway [Tue, 2 Jun 2009 03:21:56 +0000 (03:21 +0000)]
Fix dblink_get_result() as reported by Oleksiy Shchukin. Refactor a bit
while we're at it per request by Tom Lane. Specifically, don't try to
perform dblink_send_query() via dblink_record_internal() -- it was
inappropriate and ugly.
Tom Lane [Mon, 1 Jun 2009 23:55:15 +0000 (23:55 +0000)]
Change AdjustIntervalForTypmod to not discard higher-order field values on the
grounds that they don't fit into the specified interval qualifier (typmod).
This behavior, while of long standing, is clearly wrong per spec --- for
example the value INTERVAL '999' SECOND means 999 seconds and should not be
reduced to less than 60 seconds.
In some cases there could be grounds to raise an error if higher-order field
values are not given as zero; for example '1 year 1 month'::INTERVAL MONTH
should arguably be taken as an error rather than equivalent to 13 months.
However our internal representation doesn't allow us to do that in a fashion
that would consistently reject all and only the cases that a strict reading
of the spec would suggest. Also, seeing that for example INTERVAL '13' MONTH
will print out as '1 year 1 mon', we have to be careful not to create a
situation where valid data will fail to dump and reload. The present patch
therefore takes the attitude of not throwing an error in any such case.
We might want to revisit that in future but it would take more redesign
than seems prudent in late beta.
Per a complaint from Sebastien Flaesch and subsequent discussion. While
at other times we might have just postponed such an issue to the next
development cycle, 8.4 already has changed the parsing of interval literals
quite a bit in an effort to accept all spec-compliant cases correctly.
This seems like a change that should be part of that rather than coming
along later.
Tom Lane [Mon, 1 Jun 2009 16:55:11 +0000 (16:55 +0000)]
Fix DecodeInterval to report an error for multiple occurrences of DAY, WEEK,
YEAR, DECADE, CENTURY, or MILLENIUM fields, just as it always has done for
other types of fields. The previous behavior seems to have been a hack to
avoid defining bit-positions for all these field types in DTK_M() masks,
rather than something that was really considered to be desired behavior.
But there is room in the masks for these, and we really need to tighten up
at least the behavior of DAY and YEAR fields to avoid unexpected behavior
associated with the 8.4 changes to interpret ambiguous fields based on the
interval qualifier (typmod) value. Per my example and proposed patch.
Tom Lane [Sun, 31 May 2009 20:55:37 +0000 (20:55 +0000)]
Update obsolete comment in index_drop(). When the comment was written,
queries frequently took no lock at all on individual indexes. That's not
true any more, but we still need lock on the parent table to make it safe
to use cached lists of index OIDs.
Tom Lane [Wed, 27 May 2009 22:12:53 +0000 (22:12 +0000)]
Improve release note explanation of the change in libpq's handling of
default usernames versus Kerberos tickets. Per confusion about what
bug #4824 was really about.
Magnus Hagander [Wed, 27 May 2009 21:08:22 +0000 (21:08 +0000)]
Properly return the usermap result when doing gssapi authentication. Without
this, the username was in practice never matched against the kerberos principal
used to log in.
Tom Lane [Wed, 27 May 2009 20:42:29 +0000 (20:42 +0000)]
Ignore RECHECK in CREATE OPERATOR CLASS, just throwing a NOTICE, instead of
throwing an error as 8.4 had been doing. The error interfered with porting
old database definitions (particularly for pg_migrator) without really buying
any safety. Per bug #4817 and subsequent discussion.
Tom Lane [Wed, 27 May 2009 01:18:06 +0000 (01:18 +0000)]
Improve documentation about function volatility: mention the snapshot
visibility effects in a couple of places where people are likely to look
for it. Per discussion of recent question from Karl Nack.
Tom Lane [Tue, 26 May 2009 17:36:05 +0000 (17:36 +0000)]
Allow the second argument of pg_get_expr() to be just zero when deparsing
an expression that's not supposed to contain variables. Per discussion
with Gevik Babakhani, this eliminates the need for an ugly kluge (namely,
specifying some unrelated relation name). Remove one such kluge from
pg_dump.
Tom Lane [Tue, 26 May 2009 02:17:50 +0000 (02:17 +0000)]
Remove the useless and rather inconsistent return values of EncodeDateOnly,
EncodeTimeOnly, EncodeDateTime, EncodeInterval. These don't have any good
reason to fail, and their callers were mostly not checking anyway.
Tom Lane [Tue, 26 May 2009 01:29:09 +0000 (01:29 +0000)]
Add range checks to time_recv() and timetz_recv(), to prevent binary input
of time values that would not be accepted via textual input.
Per gripe from Andrew McNamara.
This is potentially a back-patchable bug fix, but for the moment it doesn't
seem sufficiently high impact to justify doing that.
Tom Lane [Sun, 24 May 2009 18:10:38 +0000 (18:10 +0000)]
Fix LIKE's special-case code for % followed by _. I'm not entirely sure that
this case is worth a special code path, but a special code path that gets
the boundary condition wrong is definitely no good. Per bug #4821 from
Andrew Gierth.
In passing, clean up some minor code formatting issues (excess parentheses
and blank lines in odd places).
Teodor Sigaev [Thu, 21 May 2009 20:09:36 +0000 (20:09 +0000)]
Resort tsvector's lexemes in tsvectorrecv instead of emmiting an error.
Basically, it's needed to support binary dump from 8.3 because ordering rule
was changed.
Update relpages and reltuples estimates in stand-alone ANALYZE, even if
there's no analyzable attributes or indexes. We also used to report 0 live
and dead tuples for such tables, which messed with autovacuum threshold
calculations.
This fixes bug #4812 reported by George Su. Backpatch back to 8.1.
Peter Eisentraut [Mon, 18 May 2009 08:59:29 +0000 (08:59 +0000)]
Some documentation cleanup for the addition of the KOI8U encoding. Change
all (remaining) mentions of KOI8 to the new canonical form KOI8R. Add
information about the available conversions for KOI8U.
Tom Lane [Fri, 15 May 2009 15:56:39 +0000 (15:56 +0000)]
Fix all the server-side SIGQUIT handlers (grumble ... why so many identical
copies?) to ensure they really don't run proc_exit/shmem_exit callbacks,
as was intended. I broke this behavior recently by installing atexit
callbacks without thinking about the one case where we truly don't want
to run those callback functions. Noted in an example from Dave Page.
Add recovery_end_command option to recovery.conf. recovery_end_command
is run at the end of archive recovery, providing a chance to do external
cleanup. Modify pg_standby so that it no longer removes the trigger file,
that is to be done using the recovery_end_command now.
Provide a "smart" failover mode in pg_standby, where we don't fail over
immediately, but only after recovering all unapplied WAL from the archive.
That gives you zero data loss assuming all WAL was archived before
failover, which is what most users of pg_standby actually want.
recovery_end_command by Simon Riggs, pg_standby changes by Fujii Masao and
myself.
Tom Lane [Wed, 13 May 2009 22:32:55 +0000 (22:32 +0000)]
Add checks to DefineQueryRewrite() to prohibit attaching rules to relations
that aren't RELKIND_RELATION or RELKIND_VIEW, and to disallow attaching rules
to system relations unless allowSystemTableMods is on. This is to make the
behavior of CREATE RULE more like CREATE TRIGGER, which disallows the
comparable cases. Per discussion of bug #4808.
Tom Lane [Wed, 13 May 2009 20:27:17 +0000 (20:27 +0000)]
Rewrite xml.c's memory management (yet again). Give up on the idea of
redirecting libxml's allocations into a Postgres context. Instead, just let
it use malloc directly, and add PG_TRY blocks as needed to be sure we release
libxml data structures in error recovery code paths. This is ugly but seems
much more likely to play nicely with third-party uses of libxml, as seen in
recent trouble reports about using Perl XML facilities in pl/perl and bug
#4774 about contrib/xml2.
I left the code for allocation redirection in place, but it's only
built/used if you #define USE_LIBXMLCONTEXT. This is because I found it
useful to corral libxml's allocations in a palloc context when hunting
for libxml memory leaks, and we're surely going to have more of those
in the future with this type of approach. But we don't want it turned on
in a normal build because it breaks exactly what we need to fix.
I have not re-indented most of the code sections that are now wrapped
by PG_TRY(); that's for ease of review. pg_indent will fix it.
This is a pre-existing bug in 8.3, but I don't dare back-patch this change
until it's gotten a reasonable amount of field testing.
Tom Lane [Tue, 12 May 2009 20:17:40 +0000 (20:17 +0000)]
Fix intratransaction memory leaks in xml_recv, xmlconcat, xmlroot, and
xml_parse, all arising from the same sloppy usage of parse_xml_decl.
The original coding had that function returning its output string
parameters in the libxml context, which is long-lived, and all but one
of its callers neglected to free the strings afterwards. The easiest
and most bulletproof fix is to return the strings in the local palloc
context instead, since that's short-lived. This was only costing a
dozen or two bytes per function call, but that adds up fast if the
function is called repeatedly ...
Noted while poking at the more general problem of what to do with our
libxml memory allocation hooks. Back-patch to 8.3, which has the
identical coding.
Tom Lane [Tue, 12 May 2009 16:43:32 +0000 (16:43 +0000)]
Fix LOCK TABLE to eliminate the race condition that could make it give weird
errors when tables are concurrently dropped. To do this we must take lock
on each relation before we check its privileges. The old code was trying
to do that the other way around, which is a bit pointless when there are lots
of other commands that lock relations before checking privileges. I did keep
it checking each relation's privilege before locking the next relation, which
is a detail that ALTER TABLE isn't too picky about.
Tom Lane [Tue, 12 May 2009 03:11:02 +0000 (03:11 +0000)]
Modify find_inheritance_children() and find_all_inheritors() to add the
ability to lock relations as they scan pg_inherits, and to ignore any
relations that have disappeared by the time we get lock on them. This
makes uses of these functions safe against concurrent DROP operations
on child tables: we will effectively ignore any just-dropped child,
rather than possibly throwing an error as in recent bug report from
Thomas Johansson (and similar past complaints). The behavior should
not change otherwise, since the code was acquiring those same locks
anyway, just a little bit later.
An exception is LockTableCommand(), which is still behaving unsafely;
but that seems to require some more discussion before we change it.
Tom Lane [Tue, 12 May 2009 00:56:05 +0000 (00:56 +0000)]
Do some minor code refactoring in preparation for changing the APIs of
find_inheritance_children() and find_all_inheritors(). I got annoyed that
these are buried inside the planner but mostly used elsewhere. So, create
a new file catalog/pg_inherits.c and put them there, along with a couple
of other functions that search pg_inherits.
The code that modifies pg_inherits is (still) in tablecmds.c --- it's
kind of entangled with unrelated code that modifies pg_depend and other
stuff, so pulling it out seemed like a bigger change than I wanted to make
right now. But this file provides a natural home for it if anyone ever
gets around to that.
This commit just moves code around; it doesn't change anything, except
I succumbed to the temptation to make a couple of trivial optimizations
in typeInheritsFrom().
Tom Lane [Mon, 11 May 2009 17:56:08 +0000 (17:56 +0000)]
Partially revert my patch of 2008-11-12 that installed a limit on the number
of AND/OR clause branches that predtest.c would attempt to deal with. As
noted in bug #4721, that change disabled proof attempts for sizes of problems
that people are actually expecting it to work for. The original complaint
it was trying to solve was O(N^2) behavior for long IN-lists, so let's try
applying the limit to just ScalarArrayOpExprs rather than everything.
Another case of "foolish consistency" I fear.
Back-patch to 8.2, same as the previous patch was.
Tom Lane [Sun, 10 May 2009 22:45:28 +0000 (22:45 +0000)]
Make a marginal performance improvement in predicate_implied_by and
predicate_refuted_by: if either top-level input is a single-element list,
reduce it to its lone member before proceeding. This avoids
a useless level of AND-recursion within the recursive proof routines.
It's worth doing because, for example, if the clause is a 100-element
list and the predicate is a 1-element list then we'd otherwise strip
the predicate's list structure 100 times as we iterate through the clause.
It's only needed at top level because there won't be any trivial ANDs below
that --- this situation is an artifact of the decision to represent even
single-item conditions as Lists in the "implicit AND" format, and that format
is only used at the top level of any predicate or restriction condition.
Tom Lane [Sun, 10 May 2009 02:51:44 +0000 (02:51 +0000)]
Adjust pg_dumpall so that it emits ENCODING, LC_COLLATE, and LC_CTYPE options
in its CREATE DATABASE commands only for databases that have settings
different from the installation defaults. This is a low-tech method of
avoiding unnecessary platform dependencies in dump files. Eventually we ought
to have a platform-independent way of specifying LC_COLLATE and LC_CTYPE, but
that's not going to happen for 8.4, and this patch at least avoids the issue
for people who aren't setting up per-database locales. ENCODING doesn't have
the platform dependency problem, but it seems consistent to make it act the
same as the locale settings.
Tom Lane [Sat, 9 May 2009 22:51:41 +0000 (22:51 +0000)]
Fix cost_nestloop and cost_hashjoin to model the behavior of semi and anti
joins a bit better, ie, understand the differing cost functions for matched
and unmatched outer tuples. There is more that could be done in cost_hashjoin
but this already helps a great deal. Per discussions with Robert Haas.
Add alternative expected output files for cs_CZ locale for btree_gist and
tsearch2 tests. This should make 'comet_moth' buildfarm member pass
contrib check. Zdenek Kotala.
Tom Lane [Thu, 7 May 2009 22:58:28 +0000 (22:58 +0000)]
Add an option to AlterTableCreateToastTable() to allow its caller to force
a toast table to be built, even if the sum-of-column-widths calculation
indicates one isn't needed. This is needed by pg_migrator because if the
old table has a toast table, we have to migrate over the toast table since
it might contain some live data, even though subsequent column drops could
mean that no recently-added rows could require toasting.
Tom Lane [Thu, 7 May 2009 22:01:18 +0000 (22:01 +0000)]
Change pgbench to use the table names pgbench_accounts, pgbench_branches,
pgbench_history, and pgbench_tellers, rather than just accounts, branches,
history, and tellers. This is to prevent accidental conflicts with real
application tables, as has been reported to happen at least once. Also
remove the automatic "SET search_path = public" that it did at startup,
as this seems to restrict testing flexibility without actually buying much.
Per proposal by Joshua Drake and ensuing discussion.
Tom Lane [Thu, 7 May 2009 20:13:09 +0000 (20:13 +0000)]
Ooops ... make_outerjoininfo wasn't actually enforcing the join order
restrictions specified for semijoins in optimizer/README, to wit that
you can't reassociate outer joins into or out of the RHS of a semijoin.
Per report from Heikki.
Request XLOG switch before writing checkpoint in pg_start_backup(). Otherwise
you can end up with an unrecoverable backup if you start a new base backup
right after finishing archive recovery. In that scenario, the redo pointer of
the checkpoint that pg_start_backup() writes points to the XLOG segment where
the timeline-changing end-of-archive-recovery checkpoint is. The beginning
of that segment contains pages with the old timeline ID, and we don't accept
that in recovery unless we find a history file covering the old timeline ID.
If you omit pg_xlog from the base backup and clear the archive directory
before starting the backup, there will be no such history file available.
The bug is present in all versions since PITR was introduced in 8.0, but I'm
back-patching only back to 8.2. Earlier versions didn't have XLOG switch
records, making this fix unfeasible. Given the lack of reports until now,
it doesn't seem worthwhile to spend more effort to fix 8.0 and 8.1.
Tom Lane [Wed, 6 May 2009 20:31:18 +0000 (20:31 +0000)]
Tweak distribute_qual_to_rels so that when we decide a pseudoconstant qual
can be pushed to the top of the join tree, we update both the relids and
qualscope variables to keep them in sync. This prevents a possible later
failure of an Assert clause, and affects nothing else since qualscope isn't
used later except for that Assert. At the moment the Assert shouldn't be
reachable when we've pushed the qual up; but this is cheap insurance, and
it's more sensible anyway in terms of the overall logic of the routine.
Per analysis of a bug report from Stefan Huehner.
I'm not back-patching this since it's just future-proofing; but if anyone
gets tempted to change check_outerjoin_delay again in the back branches,
this might be needed.
Tom Lane [Wed, 6 May 2009 16:15:21 +0000 (16:15 +0000)]
Modify CREATE DATABASE to enforce that the source database's encoding setting
must be used for the new database, except when copying from template0.
This is the same rule that we now enforce for locale settings, and it has
the same motivation: databases other than template0 might contain data that
would be invalid according to a different setting. This represents another
step in a continuing process of locking down ways in which encoding violations
could occur inside the backend. Per discussion of a few days ago.
In passing, fix pre-existing breakage of mbregress.sh, and fix up a couple
of ereport() calls in dbcommands.c that failed to specify sqlstate codes.