Tom Lane [Tue, 10 Aug 2010 21:51:00 +0000 (21:51 +0000)]
Add three-parameter forms of array_to_string and string_to_array, to allow
better handling of NULL elements within the arrays. The third parameter
is a string that should be used to represent a NULL element, or should
be translated into a NULL element, respectively. If the third parameter
is NULL it behaves the same as the two-parameter form.
There are two incompatible changes in the behavior of the two-parameter form
of string_to_array. First, it will return an empty (zero-element) array
rather than NULL when the input string is of zero length. Second, if the
field separator is NULL, the function splits the string into individual
characters, rather than returning NULL as before. These two changes make
this form fully compatible with the behavior of the new three-parameter form.
Tom Lane [Mon, 9 Aug 2010 18:50:11 +0000 (18:50 +0000)]
Fix incorrect logic in plpgsql for cleanup after evaluation of non-simple
expressions. We need to deal with this when handling subscripts in an array
assignment, and also when catching an exception. In an Assert-enabled build
these omissions led to Assert failures, but I think in a normal build the
only consequence would be short-term memory leakage; which may explain why
this wasn't reported from the field long ago.
Back-patch to all supported versions. 7.4 doesn't have exceptions, but
otherwise these bugs go all the way back.
Tom Lane [Mon, 9 Aug 2010 02:25:07 +0000 (02:25 +0000)]
Modify the handling of RAISE without parameters so that the error it throws
can be caught in the same places that could catch an ordinary RAISE ERROR
in the same location. The previous coding insisted on throwing the error
from the block containing the active exception handler; which is arguably
more surprising, and definitely unlike Oracle's behavior.
Not back-patching, since this is a pretty obscure corner case. The risk
of breaking somebody's code in a minor version update seems to outweigh
any possible benefit.
Tom Lane [Sun, 8 Aug 2010 16:27:06 +0000 (16:27 +0000)]
Add stats functions and views to provide access to a transaction's own
statistics counts. These numbers are being accumulated but haven't yet been
transmitted to the collector (and won't be, until the transaction ends).
For some purposes, though, it's handy to be able to look at them.
Tom Lane [Sat, 7 Aug 2010 02:44:09 +0000 (02:44 +0000)]
Recognize functional dependency on primary keys. This allows a table's
other columns to be referenced without listing them in GROUP BY, so long as
the primary key column(s) are listed in GROUP BY.
Eventually we should also allow functional dependency on a UNIQUE constraint
when the columns are marked NOT NULL, but that has to wait until NOT NULL
constraints are represented in pg_constraint, because we need to have
pg_constraint OIDs for all the conditions needed to ensure functional
dependency.
Peter Eisentraut, reviewed by Alex Hunsaker and Tom Lane
Tom Lane [Thu, 5 Aug 2010 21:45:35 +0000 (21:45 +0000)]
Add a very specific hint for the case that we're unable to locate a function
matching a call like f(x, ORDER BY y,z). It could be that what the user
really wants is f(x,z ORDER BY y). We now have pretty conclusive evidence
that many people won't understand this problem without concrete guidance,
so give it to them. Per further discussion of the string_agg() problem.
Tom Lane [Thu, 5 Aug 2010 18:21:19 +0000 (18:21 +0000)]
Remove the single-argument form of string_agg(). It added nothing much in
functionality, while creating an ambiguity in usage with ORDER BY that at
least two people have already gotten seriously confused by. Also, add an
opr_sanity test to check that we don't in future violate the newly minted
policy of not having built-in aggregates with the same name and different
numbers of parameters. Per discussion of a complaint from Thom Brown.
Robert Haas [Thu, 5 Aug 2010 14:45:09 +0000 (14:45 +0000)]
Standardize get_whatever_oid functions for object types with
unqualified names.
- Add a missing_ok parameter to get_tablespace_oid.
- Avoid duplicating get_tablespace_od guts in objectNamesToOids.
- Add a missing_ok parameter to get_database_oid.
- Replace get_roleid and get_role_checked with get_role_oid.
- Add get_namespace_oid, get_language_oid, get_am_oid.
- Refactor existing code to use new interfaces.
Robert Haas [Wed, 4 Aug 2010 17:33:09 +0000 (17:33 +0000)]
Fix numeric_maximum_size() calculation.
The old computation can sometimes underestimate the necessary space
by 2 bytes; however we're not back-patching this, because this result
isn't used for anything critical. Per discussion with Tom Lane,
make the typmod test in this function match the ones in numeric()
and apply_typmod() exactly.
Tom Lane [Tue, 3 Aug 2010 21:21:03 +0000 (21:21 +0000)]
Replace the naive HYPOT() macro with a standards-conformant hypotenuse
function. This avoids unnecessary overflows and probably gives a more
accurate result as well.
Tom Lane [Tue, 3 Aug 2010 19:24:05 +0000 (19:24 +0000)]
Code review for --quote-all-identifiers patch: add missing --help documentation
for new pg_dump/pg_dumpall parameters, make a couple of trivial stylistic
adjustments to make the code follow usual project style.
Tom Lane [Tue, 3 Aug 2010 18:33:09 +0000 (18:33 +0000)]
Kibitzing on \conninfo patch: adjust the order of field output to match
the parameters of \connect, and fix oversight of not enabling translation
of the messages. Also, adjust \connect's similar messages to match, and
deal with 8.2-era violation of basic translatability guidelines there.
Tom Lane [Tue, 3 Aug 2010 16:31:02 +0000 (16:31 +0000)]
Add some comments to tinterval_cmp_internal pointing out its severe
implementation deficiencies. Per discussion of bug #5592, we're not
going to change it, but these things should be documented so that if
anyone ever reimplements type tinterval, they will be more careful.
Robert Haas [Tue, 3 Aug 2010 15:47:02 +0000 (15:47 +0000)]
Fix inheritance count tracking in ALTER TABLE .. ADD CONSTRAINT.
Without this patch, constraints inherited by children of a parent
table which itself has multiple inheritance parents can end up with
the wrong coninhcount. After dropping the constraint, the children
end up with a leftover copy of the constraint that is not dumped
and cannot be dropped. There is a similar problem with ALTER TABLE
.. ADD COLUMN, but that looks significantly more difficult to
resolve, so I'm committing this fix separately.
Back-patch to 8.4, which is the first release that has coninhcount.
Tom Lane [Tue, 3 Aug 2010 01:50:27 +0000 (01:50 +0000)]
Be a little more careful with the shift computations in QT2QTN and
makeTSQuerySign. The first of these is a live bug, on some platforms,
as per bug #5590 from John Regehr. However the consequences seem limited
because of the relatively narrow scope of use of QTNode.sign. The shift in
makeTSQuerySign is actually safe because TSQS_SIGLEN is unsigned, but it
seems like a good idea to insert an explicit cast rather than depend on that.
Tom Lane [Tue, 3 Aug 2010 00:10:39 +0000 (00:10 +0000)]
Fix core dump in QTNodeCompare when tsquery_cmp() is applied to two empty
tsqueries. CompareTSQ has to have a guard for the case rather than blindly
applying QTNodeCompare to random data past the end of the datums. Also,
change QTNodeCompare to be a little less trusting: use an actual test rather
than just Assert'ing that the input is sane. Problem encountered while
investigating another issue (I saw a core dump in autoanalyze on a table
containing multiple empty tsquery values).
Back-patch to all branches with tsquery support.
In HEAD, also fix some bizarre (though not outright wrong) coding in
tsq_mcontains().
Tom Lane [Mon, 2 Aug 2010 04:51:17 +0000 (04:51 +0000)]
Don't try to force use of -no-cpp-precomp on OS X. It's been five years
since Apple shipped a compiler that needed this switch, and there's
increasing interest in using other compilers that won't accept the switch
at all. Better to let anybody who still needs the switch inject it via
CPPFLAGS. Per gripe from Neil Conway.
Robert Haas [Mon, 2 Aug 2010 03:46:54 +0000 (03:46 +0000)]
Remove ancient PL/pgsql line numbering hack.
While this hack arguably has some benefit in terms of making PL/pgsql's
line numbering match the programmer's expectations, it also makes
PL/pgsql inconsistent with the remaining PLs, making it difficult for
clients to reliably determine where the error actually is. On balance,
it seems better to be consistent.
Tom Lane [Mon, 2 Aug 2010 02:29:39 +0000 (02:29 +0000)]
Tweak a couple of macros in the regex code to suppress compiler warnings
from "clang". The VERR changes make an assignment unconditional, which is
probably easier to read/understand anyway, and one can hardly argue that
it's worth shaving cycles off the case of reporting another error when
one has already been detected. The INSIST change limits where that macro
can be used, but not in a way that creates a problem for any existing call.
Tom Lane [Mon, 2 Aug 2010 01:24:54 +0000 (01:24 +0000)]
Fix an ancient typo that prevented the detection of conflicting fields when
interval input "invalid" was specified together with other fields. Spotted
by Neil Conway with the help of a clang warning. Although this has been
wrong since the interval code was written more than 10 years ago, it doesn't
affect anything beyond which error message you get for a wrong input, so not
worth back-patching very far.
Tom Lane [Sun, 1 Aug 2010 22:38:11 +0000 (22:38 +0000)]
Fix ANALYZE's ancient deficiency of not trying to collect stats for expression
indexes when the index column type (the opclass opckeytype) is different from
the expression's datatype. When coded, this limitation wasn't worth worrying
about because we had no intelligence to speak of in stats collection for the
datatypes used by such opclasses. However, now that there's non-toy
estimation capability for tsvector queries, it amounts to a bug that ANALYZE
fails to do this.
The fix changes struct VacAttrStats, and therefore constitutes an API break
for custom typanalyze functions. Therefore we can't back-patch it into
released branches, but it was agreed that 9.0 isn't yet frozen hard enough
to make such a change unacceptable. Ergo, back-patch to 9.0 but no further.
The API break had better be mentioned in 9.0 release notes.
Tom Lane [Sun, 1 Aug 2010 21:31:08 +0000 (21:31 +0000)]
Add some knowledge about prefix matches to tsmatchsel(). It's not terribly
bright, but it beats assuming that a prefix match behaves identically to an
exact match, which is what the code was doing before :-(. Noted while
experimenting with Artur Dobrowski's example.
Tom Lane [Sun, 1 Aug 2010 19:16:39 +0000 (19:16 +0000)]
Fix an additional set of problems in GIN's handling of lossy page pointers.
Although the key-combining code claimed to work correctly if its input
contained both lossy and exact pointers for a single page in a single TID
stream, in fact this did not work, and could not work without pretty
fundamental redesign. Modify keyGetItem so that it will not return such a
stream, by handling lossy-pointer cases a bit more explicitly than we did
before.
Per followup investigation of a gripe from Artur Dabrowski.
An example of a query that failed given his data set is
select count(*) from search_tab where
(to_tsvector('german', keywords ) @@ to_tsquery('german', 'ee:* | dd:*')) and
(to_tsvector('german', keywords ) @@ to_tsquery('german', 'aa:*'));
Back-patch to 8.4 where the lossy pointer code was introduced.
Tom Lane [Sun, 1 Aug 2010 02:12:42 +0000 (02:12 +0000)]
Rewrite the rbtree routines so that an RBNode is the first field of the
struct representing a tree entry, rather than being a separately allocated
piece of storage. This API is at least as clean as the old one (if not
more so --- there were some bizarre choices in there) and it permits a
very substantial memory savings, on the order of 2X in ginbulk.c's usage.
Also, fix minor memory leaks in code called by ginEntryInsert, in
particular in ginInsertValue and entryFillRoot, as well as ginEntryInsert
itself. These leaks resulted in the GIN index build context continuing
to bloat even after we'd filled it to maintenance_work_mem and started
to dump data out to the index.
In combination these fixes restore the GIN index build code to honoring
the maintenance_work_mem limit about as well as it did in 8.4. Speed
seems on par with 8.4 too, maybe even a bit faster, for a non-pathological
case in which HEAD was formerly slower.
Back-patch to 9.0 so we don't have a performance regression from 8.4.
Tom Lane [Sat, 31 Jul 2010 03:27:40 +0000 (03:27 +0000)]
Tweak tsmatchsel() so that it examines the structure of the tsquery whenever
possible (ie, whenever the tsquery is a constant), even when no statistics
are available for the tsvector. For example, foo @@ 'a & b'::tsquery
can be expected to be more selective than foo @@ 'a'::tsquery, whether
or not we know anything about foo. We use DEFAULT_TS_MATCH_SEL as the assumed
selectivity of individual query terms when no stats are available, then
combine the terms according to the query's AND/OR structure as usual.
Per experimentation with Artur Dabrowski's example. (The fact that there
are no stats available in that example is a problem in itself, but
nonetheless tsmatchsel should be smarter about the case.)
Back-patch to 8.4 to keep all versions of tsmatchsel() in sync.
Tom Lane [Sat, 31 Jul 2010 00:30:54 +0000 (00:30 +0000)]
Rewrite the key-combination logic in GIN's keyGetItem() and scanGetItem()
routines to make them behave better in the presence of "lossy" index pointers.
The previous coding was outright incorrect for some cases, as recently
reported by Artur Dabrowski: scanGetItem would fail to return index entries in
cases where one index key had multiple exact pointers on the same page as
another key had a lossy pointer. Also, keyGetItem was extremely inefficient
for cases where a single index key generates multiple "entry" streams, such as
an @@ operator with a multiple-clause tsquery. The presence of a lossy page
pointer in any one stream defeated its ability to use the opclass
consistentFn, resulting in probing many heap pages that didn't really need to
be visited. In Artur's example case, a query like
WHERE tsvector @@ to_tsquery('a & b')
was about 50X slower than the theoretically equivalent
WHERE tsvector @@ to_tsquery('a') AND tsvector @@ to_tsquery('b')
The way that I chose to fix this was to have GIN call the consistentFn
twice with both TRUE and FALSE values for the in-doubt entry stream,
returning a hit if either call produces TRUE, but not if they both return
FALSE. The code handles this for the case of a single in-doubt entry stream,
but punts (falling back to the stupid behavior) if there's more than one lossy
reference to the same page. The idea could be scaled up to deal with multiple
lossy references, but I think that would probably be wasted complexity. At
least to judge by Artur's example, such cases don't occur often enough to be
worth trying to optimize.
Back-patch to 8.4. 8.3 did not have lossy GIN index pointers, so not
subject to these problems.
Tom Lane [Thu, 29 Jul 2010 23:16:33 +0000 (23:16 +0000)]
Improved version of patch to protect pg_get_expr() against misuse:
look through join alias Vars to avoid breaking join queries, and
move the test to someplace where it will catch more possible ways
of calling a function. We still ought to throw away the whole thing
in favor of a data-type-based solution, but that's not feasible in
the back branches.
This needs to be back-patched further than 9.0, but I don't have time
to do so today. Committing now so that the fix gets into 9.0beta4.
Simon Riggs [Thu, 29 Jul 2010 22:27:27 +0000 (22:27 +0000)]
Rename asyncCommitLSN to asyncXactLSN to reflect changed role in 9.0.
Transaction aborts now record their LSN to avoid corner case
behaviour in SR/HS, hence change of name of variables and functions.
As pointed out by Fujii Masao. Cosmetic changes only.
Tom Lane [Thu, 29 Jul 2010 20:09:25 +0000 (20:09 +0000)]
Clean up some inconsistencies in the volatility marking of various I/O
related functions. Per today's discussion, we will henceforth assume
that datatype I/O functions are either stable or immutable, never volatile.
(This implies in particular that domain CHECK constraint expressions shouldn't
be volatile, since domain_in executes them.) In turn, functions that execute
the I/O functions of arbitrary datatypes should always be labeled stable.
This affects the labeling of array_to_string, which was unsafely marked
immutable, and record_in, record_out, record_recv, record_send,
domain_in, domain_recv, which were over-conservatively marked volatile.
The array I/O functions were already marked stable, which is correct
per this policy but would have been wrong if we maintained domain_in
as volatile.
Back-patch to 9.0, along with an earlier fix to correctly mark cash_in
and cash_out as stable not immutable (since they depend on lc_monetary).
No catversion bump --- the implications of this are not currently
severe enough to justify a forced initdb.
Peter Eisentraut [Thu, 29 Jul 2010 19:34:41 +0000 (19:34 +0000)]
Fix indentation of verbatim block elements
Block elements with verbatim formatting (literallayout, programlisting,
screen, synopsis) should be aligned at column 0 independent of the surrounding
SGML, because whitespace is significant, and indenting them creates erratic
whitespace in the output. The CSS stylesheets already take care of indenting
the output.
Tom Lane [Thu, 29 Jul 2010 19:23:20 +0000 (19:23 +0000)]
Fix another longstanding problem in copy_relation_data: it was blithely
assuming that a local char[] array would be aligned on at least a word
boundary. There are architectures on which that is pretty much guaranteed to
NOT be the case ... and those arches also don't like non-aligned memory
accesses, meaning that log_newpage() would crash if it ever got invoked.
Even on Intel-ish machines there's a potential for a large performance penalty
from doing I/O to an inadequately aligned buffer. So palloc it instead.
Tom Lane [Thu, 29 Jul 2010 18:29:52 +0000 (18:29 +0000)]
Work around a documentation toolchain problem by replacing the "AIX-fixlevels"
table with a <variablelist> carrying the same information. Previously the
9.0 documentation was failing to build as a US-size PDF file. It's quite
obscure what the real problem is or why this avoids it, but we need a hack
now so we can build docs for beta4.
In passing do a bit of editing in the AIX installation docs, in particular
remove a long-obsolete claim that the regression tests are likely to fail.
Robert Haas [Thu, 29 Jul 2010 16:14:36 +0000 (16:14 +0000)]
Fix possible page corruption by ALTER TABLE .. SET TABLESPACE.
If a zeroed page is present in the heap, ALTER TABLE .. SET TABLESPACE will
set the LSN and TLI while copying it, which is wrong, and heap_xlog_newpage()
will do the same thing during replay, so the corruption propagates to any
standby. Note, however, that the bug can't be demonstrated unless archiving
is enabled, since in that case we skip WAL logging altogether, and the LSN/TLI
are not set.
Back-patch to 8.0; prior releases do not have tablespaces.
Analysis and patch by Jeff Davis. Adjustments for back-branches and minor
wordsmithing by me.
Simon Riggs [Thu, 29 Jul 2010 11:06:34 +0000 (11:06 +0000)]
Add explicit regression tests for ALTER TABLE lock levels.
Use this to catch a couple of lock level assignments that slipped
through manual testing, per Peter Eisentraut.
Tom Lane [Wed, 28 Jul 2010 17:21:56 +0000 (17:21 +0000)]
Fix oversight in new EvalPlanQual logic: the second loop over the ExecRowMark
list in ExecLockRows() forgot to allow for the possibility that some of the
rowmarks are for child tables that aren't relevant to the current row.
Per report from Kenichiro Tanaka.
Simon Riggs [Wed, 28 Jul 2010 05:22:24 +0000 (05:22 +0000)]
Reduce lock levels of CREATE TRIGGER and some ALTER TABLE, CREATE RULE actions.
Avoid hard-coding lockmode used for many altering DDL commands, allowing easier
future changes of lock levels. Implementation of initial analysis on DDL
sub-commands, so that many lock levels are now at ShareUpdateExclusiveLock or
ShareRowExclusiveLock, allowing certain DDL not to block reads/writes.
First of number of planned changes in this area; additional docs required
when full project complete.
Tom Lane [Wed, 28 Jul 2010 04:50:50 +0000 (04:50 +0000)]
Fix potential failure when hashing the output of a subplan that produces
a pass-by-reference datatype with a nontrivial projection step.
We were using the same memory context for the projection operation as for
the temporary context used by the hashtable routines in execGrouping.c.
However, the hashtable routines feel free to reset their temp context at
any time, which'd lead to destroying input data that was still needed.
Report and diagnosis by Tao Ma.
Back-patch to 8.1, where the problem was introduced by the changes that
allowed us to work with "virtual" tuples instead of materializing intermediate
tuple values everywhere. The earlier code looks quite similar, but it doesn't
suffer the problem because the data gets copied into another context as a
result of having to materialize ExecProject's output tuple.
Bruce Momjian [Sun, 25 Jul 2010 03:28:32 +0000 (03:28 +0000)]
Prevent pg_upgrade from migrating databases that use reg* data types
where the oid is not preserved by pg_upgrade (everything but pg_type).
Update documentation.
Robert Haas [Thu, 22 Jul 2010 01:22:35 +0000 (01:22 +0000)]
Add options to force quoting of all identifiers.
I've added a quote_all_identifiers GUC which affects the behavior
of the backend, and a --quote-all-identifiers argument to pg_dump
and pg_dumpall which sets the GUC and also affects the quoting done
internally by those applications.
Design by Tom Lane; review by Alex Hunsaker; in response to bug #5488
filed by Hartmut Goebel.
Robert Haas [Thu, 22 Jul 2010 00:47:59 +0000 (00:47 +0000)]
Centralize DML permissions-checking logic.
Remove bespoke code in DoCopy and RI_Initial_Check, which now instead
fabricate call ExecCheckRTPerms with a manufactured RangeTblEntry.
This is intended to make it feasible for an enhanced security provider
to actually make use of ExecutorCheckPerms_hook, but also has the
advantage that RI_Initial_Check can allow use of the fast-path when
column-level but not table-level permissions are present.
KaiGai Kohei. Reviewed (in an earlier version) by Stephen Frost, and by me.
Some further changes to the comments by me.
Robert Haas [Tue, 20 Jul 2010 14:14:30 +0000 (14:14 +0000)]
Have \conninfo mention the port even for local sockets.
Per discussion with David Christensen, there can be multiple
instances of PG accessible via local sockets, and you need the port
to see which one you're actually connected to. David's original
patch worked this way, but I inadvertently ripped it out during
commit.
Robert Haas [Tue, 20 Jul 2010 00:47:53 +0000 (00:47 +0000)]
Add restart_after_crash GUC.
Normally, we automatically restart after a backend crash, but in some
cases when PostgreSQL is invoked by clusterware it may be desirable to
suppress this behavior, so we provide an option which does this.
Since no existing GUC group quite fits, create a new group called
"error handling options" for this and the previously undocumented GUC
exit_on_error, which is now documented.
Tom Lane [Sun, 18 Jul 2010 23:43:32 +0000 (23:43 +0000)]
Remove unnecessary "Not safe to send CSV data" complaint from elog.c's fallback
path when CSV logging is configured but not yet operational. It's sufficient
to send the message to stderr, as we were already doing, and the "Not safe"
gripe has already confused at least two core members ...
Backpatch to 9.0, but not further --- doesn't seem appropriate to change
this behavior in stable branches.