Tom Lane [Thu, 26 May 2011 23:25:19 +0000 (19:25 -0400)]
Make decompilation of optimized CASE constructs more robust.
We had some hacks in ruleutils.c to cope with various odd transformations
that the optimizer could do on a CASE foo WHEN "CaseTestExpr = RHS" clause.
However, the fundamental impossibility of covering all cases was exposed
by Heikki, who pointed out that the "=" operator could get replaced by an
inlined SQL function, which could contain nearly anything at all. So give
up on the hacks and just print the expression as-is if we fail to recognize
it as "CaseTestExpr = RHS". (We must cover that case so that decompiled
rules print correctly; but we are not under any obligation to make EXPLAIN
output be 100% valid SQL in all cases, and already could not do so in some
other cases.) This approach requires that we have some printable
representation of the CaseTestExpr node type; I used "CASE_TEST_EXPR".
Back-patch to all supported branches, since the problem case fails in all.
Tom Lane [Mon, 23 May 2011 16:53:05 +0000 (12:53 -0400)]
Install defenses against overflow in BuildTupleHashTable().
The planner can sometimes compute very large values for numGroups, and in
cases where we have no alternative to building a hashtable, such a value
will get fed directly to BuildTupleHashTable as its nbuckets parameter.
There were two ways in which that could go bad. First, BuildTupleHashTable
declared the parameter as "int" but most callers were passing "long"s,
so on 64-bit machines undetected overflow could occur leading to a bogus
negative value. The obvious fix for that is to change the parameter to
"long", which is what I've done in HEAD. In the back branches that seems a
bit risky, though, since third-party code might be calling this function.
So for them, just put in a kluge to treat negative inputs as INT_MAX.
Second, hash_create can go nuts with extremely large requested table sizes
(notably, my_log2 becomes an infinite loop for inputs larger than
LONG_MAX/2). What seems most appropriate to avoid that is to bound the
initial table size request to work_mem.
This fixes bug #6035 reported by Daniel Schreiber. Although the reported
case only occurs back to 8.4 since it involves WITH RECURSIVE, I think
it's a good idea to install the defenses in all supported branches.
Tom Lane [Thu, 12 May 2011 15:56:38 +0000 (11:56 -0400)]
Fix write-past-buffer-end in ldapServiceLookup().
The code to assemble ldap_get_values_len's output into a single string
wrote the terminating null one byte past where it should. Fix that,
and make some other cosmetic adjustments to make the code a trifle more
readable and more in line with usual Postgres coding style.
Also, free the "result" string when done with it, to avoid a permanent
memory leak.
Bug report and patch by Albe Laurenz, cosmetic adjustments by me.
Tom Lane [Fri, 29 Apr 2011 05:45:27 +0000 (01:45 -0400)]
Rewrite pg_size_pretty() to avoid compiler bug.
Convert it to use successive shifts right instead of increasing a divisor.
This is probably a tad more efficient than the original coding, and it's
nicer-looking than the previous patch because we don't need a special case
to avoid overflow in the last branch. But the real reason to do it is to
avoid a Solaris compiler bug, as per results from buildfarm member moa.
Tom Lane [Wed, 27 Apr 2011 17:58:59 +0000 (13:58 -0400)]
Fix array- and path-creating functions to ensure padding bytes are zeroes.
Per recent discussion, it's important for all computed datums (not only the
results of input functions) to not contain any ill-defined (uninitialized)
bits. Failing to ensure that can result in equal() reporting that
semantically indistinguishable Consts are not equal, which in turn leads to
bizarre and undesirable planner behavior, such as in a recent example from
David Johnston. We might eventually try to fix this in a general manner by
allowing datatypes to define identity-testing functions, but for now the
path of least resistance is to expect datatypes to force all unused bits
into consistent states.
Per some testing by Noah Misch, array and path functions seem to be the
only ones presenting risks at the moment, so I looked through all the
functions in adt/array*.c and geo_ops.c and fixed them as necessary. In
the array functions, the easiest/safest fix is to allocate result arrays
with palloc0 instead of palloc. Possibly in future someone will want to
look into whether we can just zero the padding bytes, but that looks too
complex for a back-patchable fix. In the path functions, we already had a
precedent in path_in for just zeroing the one known pad field, so duplicate
that code as needed.
Tom Lane [Mon, 25 Apr 2011 20:22:28 +0000 (16:22 -0400)]
Fix pg_size_pretty() to avoid overflow for inputs close to INT64_MAX.
The expression that tried to round the value to the nearest TB could
overflow, leading to bogus output as reported in bug #5993 from Nicola
Cossu. This isn't likely to ever happen in the intended usage of the
function (if it could, we'd be needing to use a wider datatype instead);
but it's not hard to give the expected output, so let's do so.
On IA64 architecture, we check the depth of the register stack in addition
to the regular stack. The code to do that is platform and compiler specific,
add support for the HP-UX native compiler.
Tom Lane [Thu, 7 Apr 2011 19:15:00 +0000 (15:15 -0400)]
Modernize dlopen interface code for FreeBSD and OpenBSD.
Remove the hard-wired assumption that __mips__ (and only __mips__) lacks
dlopen in FreeBSD and OpenBSD. This assumption is outdated at least for
OpenBSD, as per report from an anonymous 9.1 tester. We can perfectly well
use HAVE_DLOPEN instead to decide which code to use.
Some other cosmetic adjustments to make freebsd.c, netbsd.c, and openbsd.c
exactly alike.
Tom Lane [Thu, 7 Apr 2011 15:40:44 +0000 (11:40 -0400)]
Fix SortTocFromFile() to cope with lines that are too long for its buffer.
The original coding supposed that a dump TOC file could never contain lines
longer than 1K. The folly of that was exposed by a recent report from
Per-Olov Esgard. We only really need to see the first dozen or two bytes
of each line, since we're just trying to read off the numeric ID at the
start of the line; so there's no need for a particularly huge buffer.
What there is a need for is logic to not process continuation bufferloads.
Back-patch to all supported branches, since it's always been like this.
Tom Lane [Wed, 23 Mar 2011 20:57:41 +0000 (16:57 -0400)]
Improve user-defined-aggregates documentation.
On closer inspection, that two-element initcond value seems to have been
a little white lie to avoid explaining the full behavior of float8_accum.
But if people are going to expect the examples to be exactly correct,
I suppose we'd better explain. Per comment from Thom Brown.
Tom Lane [Tue, 22 Mar 2011 17:01:23 +0000 (13:01 -0400)]
Avoid potential deadlock in InitCatCachePhase2().
Opening a catcache's index could require reading from that cache's own
catalog, which of course would acquire AccessShareLock on the catalog.
So the original coding here risks locking index before heap, which could
deadlock against another backend trying to get exclusive locks in the
normal order. Because InitCatCachePhase2 is only called when a backend
has to start up without a relcache init file, the deadlock was seldom seen
in the field. (And by the same token, there's no need to worry about any
performance disadvantage; so not much point in trying to distinguish
exactly which catalogs have the risk.)
Bug report, diagnosis, and patch by Nikhil Sontakke. Additional commentary
by me. Back-patch to all supported branches.
Andrew Dunstan [Thu, 17 Mar 2011 04:24:04 +0000 (00:24 -0400)]
Use correct PATH separator for Cygwin in pg_regress.c.
This has been broken for years, and I'm not sure why it has not been
noticed before, but now a very modern Cygwin breaks on it, and the fix
is clearly correct. Backpatching to all live branches.
Tom Lane [Fri, 11 Mar 2011 23:19:11 +0000 (18:19 -0500)]
Put in some more safeguards against executing a division-by-zero.
Add dummy returns before every potential division-by-zero in int8.c,
because apparently further "improvements" in gcc's optimizer have
enabled it to break functions that weren't broken before.
Tom Lane [Tue, 22 Feb 2011 02:18:30 +0000 (21:18 -0500)]
Fix dangling-pointer problem in before-row update trigger processing.
ExecUpdate checked for whether ExecBRUpdateTriggers had returned a new
tuple value by seeing if the returned tuple was pointer-equal to the old
one. But the "old one" was in estate->es_junkFilter's result slot, which
would be scribbled on if we had done an EvalPlanQual update in response to
a concurrent update of the target tuple; therefore we were comparing a
dangling pointer to a live one. Given the right set of circumstances we
could get a false match, resulting in not forcing the tuple to be stored in
the slot we thought it was stored in. In the case reported by Maxim Boguk
in bug #5798, this led to "cannot extract system attribute from virtual
tuple" failures when trying to do "RETURNING ctid". I believe there is a
very-low-probability chance of more serious errors, such as generating
incorrect index entries based on the original rather than the
trigger-modified version of the row.
In HEAD, change all of ExecBRInsertTriggers, ExecIRInsertTriggers,
ExecBRUpdateTriggers, and ExecIRUpdateTriggers so that they continue to
have similar APIs. In the back branches I just changed
ExecBRUpdateTriggers, since there is no bug in the ExecBRInsertTriggers
case.
Tom Lane [Tue, 15 Feb 2011 20:50:17 +0000 (15:50 -0500)]
Add CheckTableNotInUse calls in DROP TABLE and DROP INDEX.
Recent releases had a check on rel->rd_refcnt in heap_drop_with_catalog,
but failed to cover the possibility of pending trigger events at DROP time.
(Before 8.4 we didn't even check the refcnt.) When the trigger events were
eventually fired, you'd get "could not open relation with OID nnn" errors,
as in recent report from strk. Better to throw a suitable error when the
DROP is attempted.
Tom Lane [Thu, 27 Jan 2011 22:42:00 +0000 (17:42 -0500)]
Prevent buffer overrun while parsing an integer in a "query_int" value.
contrib/intarray's gettoken() uses a fixed-size buffer to collect an
integer's digits, and did not guard against overrunning the buffer.
This is at least a backend crash risk, and in principle might allow
arbitrary code execution. The code didn't check for overflow of the
integer value either, which while not presenting a crash risk was still
bad.
Thanks to Apple Inc's security team for reporting this issue and supplying
the fix.
Tom Lane [Thu, 27 Jan 2011 21:27:27 +0000 (16:27 -0500)]
Don't include <asm/ia64regs.h> unnecessarily.
We only need that header when compiling with icc, since the gcc variant of
ia64_get_bsp() uses in-line assembly code. Per report from Frank Brendel,
the header doesn't exist on all IA64 platforms; so don't include it unless
we need it.
Tom Lane [Fri, 21 Jan 2011 21:22:35 +0000 (16:22 -0500)]
Fix pg_restore to do the right thing when escaping large objects.
Specifically, this makes the workflow pg_dump -Fc -> pg_restore -> file
produce correct output for BLOBs when the source database has
standard_conforming_strings turned on. It was already okay when that was
off, or if pg_restore was told to restore directly into a database.
This is a back-port of commit b1732111f233bbb72788e92a627242ec28a85631 of
2009-08-04, with additional changes to emit old-style escaped bytea data
instead of hex-style. At the time, we had not heard of anyone encountering
the problem in the field, so I judged it not worth the risk of changing
back branches. Now we do have a report, from Bosco Rama, so back-patch
into 8.2 through 8.4. 9.0 and up are okay already.
Tom Lane [Mon, 17 Jan 2011 17:38:52 +0000 (12:38 -0500)]
Fix miscalculation of itemsafter in array_set_slice().
If the slice to be assigned to was before the existing array lower bound
(requiring at least one null element to spring into existence to fill the
gap), the code miscalculated how many entries needed to be copied from
the old array's null bitmap. This could result in trashing the array's
data area (as seen in bug #5840 from Karsten Loesing), or worse.
This has been broken since we first allowed the behavior of assigning to
non-adjacent slices, in 8.2. Back-patch to all affected versions.
Andrew Dunstan [Tue, 4 Jan 2011 14:41:18 +0000 (09:41 -0500)]
Work around header misdefines in modern Windows SDK when _WIN32_WINNT is less than 0x0501. Only required for versions 8.2, 8.3 and 8.4., as we defined _WIN32_WINNT as 0x0501 after that.
Tom Lane [Wed, 29 Dec 2010 03:49:57 +0000 (22:49 -0500)]
Avoid unexpected conversion overflow in planner for distant date values.
The "date" type supports a wider range of dates than int64 timestamps do.
However, there is pre-int64-timestamp code in the planner that assumes that
all date values can be converted to timestamp with impunity. Fortunately,
what we really need out of the conversion is always a double (float8)
value; so even when the date is out of timestamp's range it's possible to
produce a sane answer. All we need is a code path that doesn't try to
force the result into int64. Per trouble report from David Rericha.
Back-patch to all supported versions. Although this is surely a corner
case, there's not much point in advertising a date range wider than
timestamp's if we will choke on such values in unexpected places.
Tom Lane [Sun, 19 Dec 2010 20:30:44 +0000 (15:30 -0500)]
Fix up handling of simple-form CASE with constant test expression.
eval_const_expressions() can replace CaseTestExprs with constants when
the surrounding CASE's test expression is a constant. This confuses
ruleutils.c's heuristic for deparsing simple-form CASEs, leading to
Assert failures or "unexpected CASE WHEN clause" errors. I had put in
a hack solution for that years ago (see commit 514ce7a331c5bea8e55b106d624e55732a002295 of 2006-10-01), but bug #5794
from Peter Speck shows that that solution failed to cover all cases.
Fortunately, there's a much better way, which came to me upon reflecting
that Peter's "CASE TRUE WHEN" seemed pretty redundant: we can "simplify"
the simple-form CASE to the general form of CASE, by simply omitting the
constant test expression from the rebuilt CASE construct. This is
intuitively valid because there is no need for the executor to evaluate
the test expression at runtime; it will never be referenced, because any
CaseTestExprs that would have referenced it are now replaced by constants.
This won't save a whole lot of cycles, since evaluating a Const is pretty
cheap, but a cycle saved is a cycle earned. In any case it beats kluging
ruleutils.c still further. So this patch improves const-simplification
and reverts the previous change in ruleutils.c.
Back-patch to all supported branches. The bug exists in 8.1 too, but it's
out of warranty.
After parsing a parenthesized subexpression, we must pop all pending
ANDs and NOTs off the stack, just like the case for a simple operand.
Per bug #5793.
Also fix clones of this routine in contrib/intarray and contrib/ltree,
where input of types query_int and ltxtquery had the same problem.
Tom Lane [Thu, 16 Dec 2010 04:51:07 +0000 (23:51 -0500)]
Fix up getopt() reset management so it works on recent mingw.
The mingw people don't appear to care about compatibility with non-GNU
versions of getopt, so force use of our own copy of getopt on Windows.
Also, ensure that we make use of optreset when using our own copy.
Per report from Andrew Dunstan. Back-patch to all versions supported
on Windows.
Tom Lane [Thu, 9 Dec 2010 01:01:29 +0000 (20:01 -0500)]
Force default wal_sync_method to be fdatasync on Linux.
Recent versions of the Linux system header files cause xlogdefs.h to
believe that open_datasync should be the default sync method, whereas
formerly fdatasync was the default on Linux. open_datasync is a bad
choice, first because it doesn't actually outperform fdatasync (in fact
the reverse), and second because we try to use O_DIRECT with it, causing
failures on certain filesystems (e.g., ext4 with data=journal option).
This part of the patch is largely per a proposal from Marti Raudsepp.
More extensive changes are likely to follow in HEAD, but this is as much
change as we want to back-patch.
Also clean up confusing code and incorrect documentation surrounding the
fsync_writethrough option. Those changes shouldn't result in any actual
behavioral change, but I chose to back-patch them anyway to keep the
branches looking similar in this area.
In 9.0 and HEAD, also do some copy-editing on the WAL Reliability
documentation section.
Back-patch to all supported branches, since any of them might get used
on modern Linux versions.
Tom Lane [Tue, 7 Dec 2010 03:56:07 +0000 (22:56 -0500)]
Add a stack overflow check to copyObject().
There are some code paths, such as SPI_execute(), where we invoke
copyObject() on raw parse trees before doing parse analysis on them. Since
the bison grammar is capable of building heavily nested parsetrees while
itself using only minimal stack depth, this means that copyObject() can be
the front-line function that hits stack overflow before anything else does.
Accordingly, it had better have a check_stack_depth() call. I did a bit of
performance testing and found that this slows down copyObject() by only a
few percent, so the hit ought to be negligible in the context of complete
processing of a query.
Per off-list report from Toshihide Katayama. Back-patch to all supported
branches.
Tom Lane [Wed, 1 Dec 2010 05:53:39 +0000 (00:53 -0500)]
Prevent inlining a SQL function with multiple OUT parameters.
There were corner cases in which the planner would attempt to inline such
a function, which would result in a failure at runtime due to loss of
information about exactly what the result record type is. Fix by disabling
inlining when the function's recorded result type is RECORD. There might
be some sub-cases where inlining could still be allowed, but this is a
simple and backpatchable fix, so leave refinements for another day.
Per bug #5777 from Nate Carson.
Back-patch to all supported branches. 8.1 happens to avoid a core-dump
here, but it still does the wrong thing.
Tom Lane [Fri, 26 Nov 2010 20:21:08 +0000 (15:21 -0500)]
Fix significant memory leak in contrib/xml2 functions.
Most of the functions that execute XPath queries leaked the data structures
created by libxml2. This memory would not be recovered until end of
session, so it mounts up pretty quickly in any serious use of the feature.
Per report from Pavel Stehule, though this isn't his patch.
The GiST scan algorithm uses LSNs to detect concurrent pages splits, but
temporary indexes are not WAL-logged. We used a constant LSN for temporary
indexes, on the assumption that we don't need to worry about concurrent page
splits in temporary indexes because they're only visible to the current
session. But that assumption is wrong, it's possible to insert rows and
split pages in the same session, while a scan is in progress. For example,
by opening a cursor and fetching some rows, and INSERTing new rows before
fetching some more.
Fix by generating fake increasing LSNs, used in place of real LSNs in
temporary GiST indexes.
Tom Lane [Mon, 15 Nov 2010 19:27:12 +0000 (14:27 -0500)]
Fix aboriginal mistake in plpython's set-returning-function support.
We must stay in the function's SPI context until done calling the iterator
that returns the set result. Otherwise, any attempt to invoke SPI features
in the python code called by the iterator will malfunction. Diagnosis and
patch by Jan Urbanski, per bug report from Jean-Baptiste Quenot.
Back-patch to 8.2; there was no support for SRFs in previous versions of
plpython.
Tom Lane [Sat, 13 Nov 2010 05:35:08 +0000 (00:35 -0500)]
Add missing outfuncs.c support for struct InhRelation.
This is needed to support debug_print_parse, per report from Jon Nelson.
Cursory testing via the regression tests suggests we aren't missing
anything else.
Tom Lane [Fri, 12 Nov 2010 20:14:51 +0000 (15:14 -0500)]
Fix old oversight in const-simplification of COALESCE() expressions.
Once we have found a non-null constant argument, there is no need to
examine additional arguments of the COALESCE. The previous coding got it
right only if the constant was in the first argument position; otherwise
it tried to simplify following arguments too, leading to unexpected
behavior like this:
regression=# select coalesce(f1, 42, 1/0) from int4_tbl;
ERROR: division by zero
It's a minor corner case, but a bug is a bug, so back-patch all the way.
Fix bug introduced by the recent patch to check that the checkpoint redo
location read from backup label file can be found: wasShutdown was set
incorrectly when a backup label file was found.
Tom Lane [Wed, 10 Nov 2010 21:51:39 +0000 (16:51 -0500)]
Fix line_construct_pm() for the case of "infinite" (DBL_MAX) slope.
This code was just plain wrong: what you got was not a line through the
given point but a line almost indistinguishable from the Y-axis, although
not truly vertical. The only caller that tries to use this function with
m == DBL_MAX is dist_ps_internal for the case where the lseg is horizontal;
it would end up producing the distance from the given point to the place
where the lseg's line crosses the Y-axis. That function is used by other
operators too, so there are several operators that could compute wrong
distances from a line segment to something else. Per bug #5745 from
jindiax.
Tom Lane [Tue, 9 Nov 2010 16:28:18 +0000 (11:28 -0500)]
Repair memory leakage while ANALYZE-ing complex index expressions.
The general design of memory management in Postgres is that intermediate
results computed by an expression are not freed until the end of the tuple
cycle. For expression indexes, ANALYZE has to re-evaluate each expression
for each of its sample rows, and it wasn't bothering to free intermediate
results until the end of processing of that index. This could lead to very
substantial leakage if the intermediate results were large, as in a recent
example from Jakub Ouhrabka. Fix by doing ResetExprContext for each sample
row. This necessitates adding a datumCopy step to ensure that the final
expression value isn't recycled too. Some quick testing suggests that this
change adds at worst about 10% to the time needed to analyze a table with
an expression index; which is annoying, but seems a tolerable price to pay
to avoid unexpected out-of-memory problems.
Tom Lane [Sun, 7 Nov 2010 02:59:26 +0000 (22:59 -0400)]
Add support for detecting register-stack overrun on IA64.
Per recent investigation, the register stack can grow faster than the
regular stack depending on compiler and choice of options. To avoid
crashes we must check both stacks in check_stack_depth().
Tom Lane [Wed, 3 Nov 2010 17:42:19 +0000 (13:42 -0400)]
Reduce recursion depth in recently-added regression test.
Some buildfarm members fail the test with the original depth of 10 levels,
apparently because they are running at the minimum max_stack_depth setting
of 100kB and using ~ 10k per recursion level. While it might be
interesting to try to figure out why they're eating so much stack, it isn't
likely that any fix for that would be back-patchable. So just change the
test to recurse only 5 levels. The extra levels don't prove anything
correctness-wise anyway.
Tom Lane [Tue, 2 Nov 2010 21:15:29 +0000 (17:15 -0400)]
Ensure an index that uses a whole-row Var still depends on its table.
We failed to record any dependency on the underlying table for an index
declared like "create index i on t (foo(t.*))". This would create trouble
if the table were dropped without previously dropping the index. To fix,
simplify some overly-cute code in index_create(), accepting the possibility
that sometimes the whole-table dependency will be redundant. Also document
this hazard in dependency.c. Per report from Kevin Grittner.
In passing, prevent a core dump in pg_get_indexdef() if the index's table
can't be found. I came across this while experimenting with Kevin's
example. Not sure it's a real issue when the catalogs aren't corrupt, but
might as well be cautious.
Tom Lane [Thu, 28 Oct 2010 17:01:23 +0000 (13:01 -0400)]
Fix plpgsql's handling of "simple" expression evaluation.
In general, expression execution state trees aren't re-entrantly usable,
since functions can store private state information in them.
For efficiency reasons, plpgsql tries to cache and reuse state trees for
"simple" expressions. It can get away with that most of the time, but it
can fail if the state tree is dirty from a previous failed execution (as
in an example from Alvaro) or is being used recursively (as noted by me).
Fix by tracking whether a state tree is in use, and falling back to the
"non-simple" code path if so. This results in a pretty considerable speed
hit when the non-simple path is taken, but the available alternatives seem
even more unpleasant because they add overhead in the simple path. Per
idea from Heikki.
Before removing backup_label and irrevocably changing pg_control file, check
that WAL file containing the checkpoint redo-location can be found. This
avoids making the cluster irrecoverable if the redo location is in an earlie
WAL file than the checkpoint record.
Report, analysis and patch by Jeff Davis, with small changes by me.
Tom Lane [Wed, 20 Oct 2010 04:55:15 +0000 (00:55 -0400)]
Fix ecpg test building process to not generate *.dSYM junk on Macs.
The trick is to not try to build executables directly from .c files,
but to always build the intermediate .o files. For obscure reasons,
Darwin's version of gcc will leave debug cruft behind in the first
case but not the second. Per complaint from Robert Haas.
Tom Lane [Mon, 11 Oct 2010 23:05:04 +0000 (19:05 -0400)]
Fix assorted bugs in GIN's WAL replay logic.
The original coding was quite sloppy about handling the case where
XLogReadBuffer fails (because the page has since been deleted). This
would result in either "bad buffer id: 0" or an Assert failure during
replay, if indeed the page were no longer there. In a couple of places
it also neglected to check whether the change had already been applied,
which would probably result in corrupted index contents. I believe that
bug #5703 is an instance of the first problem. These issues could show up
without replication, but only if you were unfortunate enough to crash
between modification of a GIN index and the next checkpoint.
Back-patch to 8.2, which is as far back as GIN has WAL support.
Tom Lane [Sun, 3 Oct 2010 00:02:52 +0000 (20:02 -0400)]
Behave correctly if INSERT ... VALUES is decorated with additional clauses.
In versions 8.2 and up, the grammar allows attaching ORDER BY, LIMIT,
FOR UPDATE, or WITH to VALUES, and hence to INSERT ... VALUES. But the
special-case code for VALUES in transformInsertStmt() wasn't expecting any
of those, and just ignored them, leading to unexpected results. Rather
than complicate the special-case path, just ensure that the presence of any
of those clauses makes us treat the query as if it had a general SELECT.
Per report from Hitoshi Harada.
Tom Lane [Thu, 30 Sep 2010 21:21:30 +0000 (17:21 -0400)]
Use a separate interpreter for each calling SQL userid in plperl and pltcl.
There are numerous methods by which a Perl or Tcl function can subvert
the behavior of another such function executed later; for example, by
redefining standard functions or operators called by the target function.
If the target function is SECURITY DEFINER, or is called by such a
function, this means that any ordinary SQL user with Perl or Tcl language
usage rights can do essentially anything with the privileges of the target
function's owner.
To close this security hole, create a separate Perl or Tcl interpreter for
each SQL userid under which plperl or pltcl functions are executed within
a session. However, all plperlu or pltclu functions run within a session
still share a single interpreter, since they all execute at the trust
level of a database superuser anyway.
Note: this change results in a functionality loss when libperl has been
built without the "multiplicity" option: it's no longer possible to call
plperl functions under different userids in one session, since such a
libperl can't support multiple interpreters in one process. However, such
a libperl already failed to support concurrent use of plperl and plperlu,
so it's likely that few people use such versions with Postgres.
Magnus Hagander [Thu, 16 Sep 2010 20:37:18 +0000 (20:37 +0000)]
Treat exit code 128 (ERROR_WAIT_NO_CHILDREN) as non-fatal on Win32,
since it can happen when a process fails to start when the system
is under high load.
Per several bug reports and many peoples investigation.
Back-patch to 8.2, since testing shows no issues even though the
"deadman-switch" does not exist in this version.
Tom Lane [Sat, 25 Sep 2010 19:57:05 +0000 (15:57 -0400)]
Further fixes to the pg_get_expr() security fix in back branches.
It now emerges that the JDBC driver expects to be able to use pg_get_expr()
on an output of a sub-SELECT. So extend the check logic to be able to recurse
into a sub-SELECT to see if the argument is ultimately coming from an
appropriate column. Per report from Thomas Kellerer.
Tom Lane [Thu, 23 Sep 2010 20:53:37 +0000 (16:53 -0400)]
Prevent show_session_authorization from crashing when session_authorization
hasn't been set.
The only known case where this can happen is when show_session_authorization
is invoked in an autovacuum process, which is possible if an index function
calls it, as for example in bug #5669 from Andrew Geery. We could perhaps
try to return a sensible value, such as the name of the cluster-owning
superuser; but that seems like much more trouble than the case is worth,
and in any case it could create new possible failure modes. Simply
returning an empty string seems like the most appropriate fix.
Back-patch to all supported versions, even those before autovacuum, just
in case there's another way to provoke this crash.
Tom Lane [Thu, 23 Sep 2010 23:34:56 +0000 (19:34 -0400)]
Avoid sharing subpath list structure when flattening nested AppendRels.
In some situations the original coding led to corrupting the child AppendRel's
subpaths list, effectively adding other members of the parent's list to it.
This was usually masked because we never made any further use of the child's
list, but given the right combination of circumstances, we could do so. The
visible symptom would be a relation getting scanned twice, as in bug #5673
from David Schmitt.
Backpatch to 8.2, which is as far back as the risky coding appears. The
example submitted by David only fails in 8.4 and later, but I'm not convinced
that there aren't any even-more-obscure cases where 8.2 and 8.3 would fail.
Tom Lane [Thu, 23 Sep 2010 02:32:46 +0000 (22:32 -0400)]
More fixes for libpq's .gitignore file.
The previous patches failed to cover a lot of symlinks that are only
added in platform-specific cases. Make the lists match what's in the
Makefile for each branch.
Tom Lane [Thu, 2 Sep 2010 03:17:13 +0000 (03:17 +0000)]
Fix up flushing of composite-type typcache entries to be driven directly by
SI invalidation events, rather than indirectly through the relcache.
In the previous coding, we had to flush a composite-type typcache entry
whenever we discarded the corresponding relcache entry. This caused problems
at least when testing with RELCACHE_FORCE_RELEASE, as shown in recent report
from Jeff Davis, and might result in real-world problems given the kind of
unexpected relcache flush that that test mechanism is intended to model.
The new coding decouples relcache and typcache management, which is a good
thing anyway from a structural perspective. The cost is that we have to
search the typcache linearly to find entries that need to be flushed. There
are a couple of ways we could avoid that, but at the moment it's not clear
it's worth any extra trouble, because the typcache contains very few entries
in typical operation.
Back-patch to 8.2, the same as some other recent fixes in this general area.
The patch could be carried back to 8.0 with some additional work, but given
that it's only hypothetical whether we're fixing any problem observable in
the field, it doesn't seem worth the work now.
Tom Lane [Mon, 30 Aug 2010 19:51:46 +0000 (19:51 +0000)]
Back-port into 8.2 an old fix to ensure that BYTE_ORDER gets set
correctly on 64-bit Intel Solaris. Per my proposal yesterday,
8.2 is where we will start considering this platform supported.
While this patch itself could easily go into older branches,
there's not a huge amount of point unless we also make some
significantly-more-invasive changes in the spinlock support.
Tom Lane [Sun, 29 Aug 2010 19:33:43 +0000 (19:33 +0000)]
Reduce PANIC to ERROR in some occasionally-reported btree failure cases.
This patch changes _bt_split() and _bt_pagedel() to throw a plain ERROR,
rather than PANIC, for several cases that are reported from the field
from time to time:
* right sibling's left-link doesn't match;
* PageAddItem failure during _bt_split();
* parent page's next child isn't right sibling during _bt_pagedel().
In addition the error messages for these cases have been made a bit
more verbose, with additional values included.
The original motivation for PANIC here was to capture core dumps for
subsequent analysis. But with so many users whose platforms don't capture
core dumps by default, or who are unprepared to analyze them anyway, it's hard
to justify a forced database restart when we can fairly easily detect the
problems before we've reached the critical sections where PANIC would be
necessary. It is not currently known whether the reports of these messages
indicate well-hidden bugs in Postgres, or are a result of storage-level
malfeasance; the latter possibility suggests that we ought to try to be more
robust even if there is a bug here that's ultimately found.
Backpatch to 8.2. The code before that is sufficiently different that
it doesn't seem worth the trouble to back-port further.
Tom Lane [Thu, 26 Aug 2010 19:59:15 +0000 (19:59 +0000)]
Update time zone data files to tzdata release 2010l: DST law changes in
Egypt and Palestine. Added new names for two Micronesian timezones:
Pacific/Chuuk is now preferred over Pacific/Truk (and the preferred
abbreviation is CHUT not TRUT) and Pacific/Pohnpei is preferred over
Pacific/Ponape. Historical corrections for Finland.
Tom Lane [Thu, 26 Aug 2010 18:55:06 +0000 (18:55 +0000)]
Fix ExecMakeTableFunctionResult to verify that all rows returned by a SRF
returning "record" actually do have the same rowtype. This is needed because
the parser can't realistically enforce that they will all have the same typmod,
as seen in a recent example from David Wheeler.
Back-patch to 8.0, which is as far back as we have the notion of RECORD
subtypes being distinguished by typmod. Wheeler's example depends on
8.4-and-up features, but I suspect there may be ways to provoke similar
failures before 8.4.
Peter Eisentraut [Wed, 25 Aug 2010 19:37:39 +0000 (19:37 +0000)]
Catch null pointer returns from PyCObject_AsVoidPtr and PyCObject_FromVoidPtr
This is reproducibly possible in Python 2.7 if the user turned
PendingDeprecationWarning into an error, but it's theoretically also possible
in earlier versions in case of exceptional conditions.
Tom Lane [Mon, 16 Aug 2010 17:33:12 +0000 (17:33 +0000)]
Arrange to fsync the contents of lockfiles (both postmaster.pid and the
socket lockfile) when writing them. The lack of an fsync here may well
explain two different reports we've seen of corrupted lockfile contents,
which doesn't particularly bother the running server but can prevent a
new server from starting if the old one crashes. Per suggestion from
Alvaro.