Tom Lane [Fri, 1 Jun 2007 23:43:17 +0000 (23:43 +0000)]
Fix aboriginal bug in BufFileDumpBuffer that would cause it to write the
wrong data when dumping a bufferload that crosses a component-file boundary.
This probably has not been seen in the wild because (a) component files are
normally 1GB apiece and (b) non-block-aligned buffer usage is relatively
rare. But it's fairly easy to reproduce a problem if one reduces RELSEG_SIZE
in a test build. Kudos to Kurt Harriman for spotting the bug.
Tom Lane [Fri, 1 Jun 2007 15:58:02 +0000 (15:58 +0000)]
Fix performance problems in multi-batch hash joins by ensuring that we select
a well-randomized batch number even when given a poorly-randomized hash value.
This is a bit inefficient but seems the only practical solution given the
constraint that we can't change the hash functions in released branches.
Per report from Joseph Shraibman.
Applied to 8.1 and 8.2 only --- HEAD is getting a cleaner fix, and 8.0 and
before use different coding that seems less vulnerable.
Tom Lane [Wed, 30 May 2007 21:01:45 +0000 (21:01 +0000)]
Fix overly-strict sanity check in BeginInternalSubTransaction that made it
fail when used in a deferred trigger. Bug goes back to 8.0; no doubt the
reason it hadn't been noticed is that we've been discouraging use of
user-defined constraint triggers. Per report from Frank van Vugt.
Neil Conway [Tue, 29 May 2007 04:59:15 +0000 (04:59 +0000)]
Fix a bug in input processing for the "interval" type. Previously,
"microsecond" and "millisecond" units were not considered valid input
by themselves, which caused inputs like "1 millisecond" to be rejected
erroneously.
Update the docs, add regression tests, and backport to 8.2 and 8.1
Tom Lane [Tue, 22 May 2007 23:24:09 +0000 (23:24 +0000)]
Repair planner bug introduced in 8.2 by ability to rearrange outer joins:
in cases where a sub-SELECT inserts a WHERE clause between two outer joins,
that clause may prevent us from re-ordering the two outer joins. The code
was considering only the joins' own ON-conditions in determining reordering
safety, which is not good enough. Add a "delay_upper_joins" flag to
OuterJoinInfo to flag that we have detected such a clause and higher-level
outer joins shouldn't be permitted to commute with this one. (This might
seem overly coarse, but given the current rules for OJ reordering, it's
sufficient AFAICT.)
The failure case is actually pretty narrow: it needs a WHERE clause within
the RHS of a left join that checks the RHS of a lower left join, but is not
strict for that RHS (else we'd have simplified the lower join to a plain
join). Even then no failure will be manifest unless the planner chooses to
rearrange the join order.
Tom Lane [Tue, 22 May 2007 01:40:42 +0000 (01:40 +0000)]
Fix best_inner_indexscan to return both the cheapest-total-cost and
cheapest-startup-cost innerjoin indexscans, and make joinpath.c consider
both of these (when different) as the inside of a nestloop join. The
original design was based on the assumption that indexscan paths always
have negligible startup cost, and so total cost is the only important
figure of merit; an assumption that's obviously broken by bitmap
indexscans. This oversight could lead to choosing poor plans in cases
where fast-start behavior is more important than total cost, such as
LIMIT and IN queries. 8.1-vintage brain fade exposed by an example from
Chuck D.
Tom Lane [Fri, 18 May 2007 01:20:25 +0000 (01:20 +0000)]
Remove redundant logging of send failures when SSL is in use. While pqcomm.c
had been taught not to do that ages ago, the SSL code was helpfully bleating
anyway. Resolves some recent reports such as bug #3266; however the
underlying cause of the related bug #2829 is still unclear.
Tom Lane [Thu, 17 May 2007 23:31:59 +0000 (23:31 +0000)]
Temporary fix for the problem that pg_stat_activity, inet_client_addr(),
and inet_server_addr() fail if the client connected over a "scoped" IPv6
address. In this case getnameinfo() will return a string ending with
a poorly-standardized "%something" zone specifier, which these functions
try to feed to network_in(), which won't take it. So that we don't lose
functionality altogether, suppress the zone specifier before giving the
string to network_in(). Per report from Brian Hirt.
TODO: probably someday the inet type should support scoped IPv6 addresses,
and then this patch should be reverted.
Alvaro Herrera [Tue, 15 May 2007 20:20:24 +0000 (20:20 +0000)]
Avoid emitting empty role names in the GRANTED BY clause of GRANT ROLE
when the grantor has been dropped. This is a workaround for the fact
that we don't track the grantor as a shared dependency.
Neil Conway [Tue, 15 May 2007 15:35:58 +0000 (15:35 +0000)]
Add a note to the documentation to clarify that even when
"autovacuum = off", the system may still periodically start autovacuum
processes to prevent XID wraparound. Patch from David Fetter, with
editorializing.
Tom Lane [Sat, 12 May 2007 19:22:43 +0000 (19:22 +0000)]
Improve predicate_refuted_by_simple_clause() to handle IS NULL and IS NOT NULL
more completely. The motivation for having it understand IS NULL at all was
to allow use of "foo IS NULL" as one of the subsets of a partitioning on
"foo", but as reported by Aleksander Kmetec, it wasn't really getting the job
done. Backpatch to 8.2 since this is arguably a performance bug.
Tom Lane [Fri, 11 May 2007 20:18:21 +0000 (20:18 +0000)]
Fix my oversight in enabling domains-of-domains: ALTER DOMAIN ADD CONSTRAINT
needs to check the new constraint against columns of derived domains too.
Also, make it error out if the domain to be modified is used within any
composite-type columns. Eventually we should support that case, but it seems
a bit painful, and not suitable for a back-patch. For the moment just let the
user know we can't do it.
Backpatch to 8.2, which is the only released version that allows nested
domains. Possibly the other part should be back-patched further.
Magnus Hagander [Sat, 5 May 2007 17:05:55 +0000 (17:05 +0000)]
Check return code from strxfrm on Windows since it has a
non-standard way of indicating errors, so we don't try to
allocate INT_MAX bytes to store a result in.
Tom Lane [Tue, 1 May 2007 18:54:02 +0000 (18:54 +0000)]
Fix a thinko in my patch of a couple months ago for bug #3116: it did the
wrong thing when inlining polymorphic SQL functions, because it was using the
function's declared return type where it should have used the actual result
type of the current call. In 8.1 and 8.2 this causes obvious failures even if
you don't have assertions turned on; in 8.0 and 7.4 it would only be a problem
if the inlined expression were used as an input to a function that did
run-time type determination on its inputs. Add a regression test, since this
is evidently an under-tested area.
Tom Lane [Thu, 26 Apr 2007 23:24:57 +0000 (23:24 +0000)]
Fix dynahash.c to suppress hash bucket splits while a hash_seq_search() scan
is in progress on the same hashtable. This seems the least invasive way to
fix the recently-recognized problem that a split could cause the scan to
visit entries twice or (with much lower probability) miss them entirely.
The only field-reported problem caused by this is the "failed to re-find
shared lock object" PANIC in COMMIT PREPARED reported by Michel Dorochevsky,
which was caused by multiply visited entries. However, it seems certain
that mdsync() is vulnerable to missing required fsync's due to missed
entries, and I am fearful that RelationCacheInitializePhase2() might be at
risk as well. Because of that and the generalized hazard presented by this
bug, back-patch all the supported branches.
Along the way, fix pg_prepared_statement() and pg_cursor() to not assume
that the hashtables they are examining will stay static between calls.
This is risky regardless of the newly noted dynahash problem, because
hash_seq_search() has never promised to cope with deletion of table entries
other than the just-returned one. There may be no bug here because the only
supported way to call these functions is via ExecMakeTableFunctionResult()
which will cycle them to completion before doing anything very interesting,
but it seems best to get rid of the assumption. This affects 8.2 and HEAD
only, since those functions weren't there earlier.
Tom Lane [Fri, 20 Apr 2007 02:37:49 +0000 (02:37 +0000)]
Support explicit placement of the temporary-table schema within search_path.
This is needed to allow a security-definer function to set a truly secure
value of search_path. Without it, a malicious user can use temporary objects
to execute code with the privileges of the security-definer function. Even
pushing the temp schema to the back of the search path is not quite good
enough, because a function or operator at the back of the path might still
capture control from one nearer the front due to having a more exact datatype
match. Hence, disable searching the temp schema altogether for functions and
operators.
Tom Lane [Thu, 19 Apr 2007 20:24:10 +0000 (20:24 +0000)]
Repair PANIC condition in hash indexes when a previous index extension attempt
failed (due to lock conflicts or out-of-space). We might have already
extended the index's filesystem EOF before failing, causing the EOF to be
beyond what the metapage says is the last used page. Hence the invariant
maintained by the code needs to be "EOF is at or beyond last used page",
not "EOF is exactly the last used page". Problem was created by my patch
of 2006-11-19 that attempted to repair bug #2737. Since that was
back-patched to 7.4, this needs to be as well. Per report and test case
from Vlastimil Krejcir.
Tom Lane [Thu, 19 Apr 2007 16:33:32 +0000 (16:33 +0000)]
Fix plpgsql to avoid reference to already-freed memory when returning a
pass-by-reference data type and the RETURN statement is within an EXCEPTION
block. Bug introduced by my fix of 2007-01-28 to use per-subtransaction
ExprContexts/EStates; since that wasn't back-patched into older branches,
only 8.2 and HEAD are affected. Per report from Gary Winslow.
Bruce Momjian [Wed, 18 Apr 2007 00:18:31 +0000 (00:18 +0000)]
Document that the COPY delimiter must be an ASCII byte, rather than a
multi-byte value. It can also be a single-byte encoded character if
the client and server versions match.
Tom Lane [Tue, 17 Apr 2007 20:03:10 +0000 (20:03 +0000)]
Rewrite choose_bitmap_and() to make it more robust in the presence of
competing alternatives for indexes to use in a bitmap scan. The former
coding took estimated selectivity as an overriding factor, causing it to
sometimes choose indexes that were much slower to scan than ones with a
slightly worse selectivity. It was also too narrow-minded about which
combinations of indexes to consider ANDing. The rewrite makes it pay more
attention to index scan cost than selectivity; this seems sane since it's
impossible to have very bad selectivity with low cost, whereas the reverse
isn't true. Also, we now consider each index alone, as well as adding
each index to an AND-group led by each prior index, for a total of about
O(N^2) rather than O(N) combinations considered. This makes the results
much less dependent on the exact order in which the indexes are
considered. It's still a lot cheaper than an O(2^N) exhaustive search.
A prefilter step eliminates all but the cheapest of those indexes using
the same set of WHERE conditions, to keep the effective value of N down in
scenarios where the DBA has created lots of partially-redundant indexes.
Tom Lane [Mon, 16 Apr 2007 18:42:17 +0000 (18:42 +0000)]
Fix pg_dump to not crash if -t or a similar switch is used to select a serial
sequence for dumping without also selecting its owning table. Make it not try
to emit ALTER SEQUENCE OWNED BY in this situation.
Per report from Michael Nolan.
Tom Lane [Thu, 12 Apr 2007 17:11:00 +0000 (17:11 +0000)]
Rearrange mdsync() looping logic to avoid the problem that a sufficiently
fast flow of new fsync requests can prevent mdsync() from ever completing.
This was an unforeseen consequence of a patch added in Mar 2006 to prevent
the fsync request queue from overflowing. Problem identified by Heikki
Linnakangas and independently by ITAGAKI Takahiro; fix based on ideas from
Takahiro-san, Heikki, and Tom.
Back-patch as far as 8.1 because a previous back-patch introduced the problem
into 8.1 ...
Tom Lane [Thu, 12 Apr 2007 15:04:41 +0000 (15:04 +0000)]
Cancel pending fsync requests during WAL replay of DROP DATABASE, per bug
report from David Darville. Back-patch as far as 8.1, which may or may not
have the problem but it seems a safe change anyway.
Tom Lane [Mon, 2 Apr 2007 18:49:36 +0000 (18:49 +0000)]
Fix check_sql_fn_retval to allow the case where a SQL function declared to
return void ends with a SELECT, if that SELECT has a single result that is
also of type void. Without this, it's hard to write a void function that
calls another void function. Per gripe from Peter.
Tom Lane [Fri, 30 Mar 2007 00:13:05 +0000 (00:13 +0000)]
Fix oversight in coding of _bt_start_vacuum: we can't assume that the LWLock
will be released by transaction abort before _bt_end_vacuum gets called.
If either of these "can't happen" errors actually happened, we'd freeze up
trying to acquire an already-held lock. Latest word is that this does
not explain Martin Pitt's trouble report, but it still looks like a bug.
Tom Lane [Sat, 17 Mar 2007 03:15:47 +0000 (03:15 +0000)]
SPI_cursor_open failed to enforce that only read-only queries could be
executed in read_only mode. This could lead to various relatively-subtle
failures, such as an allegedly stable function returning non-stable results.
Bug goes all the way back to the introduction of read-only mode in 8.0.
Per report from Gaetano Mendola.
Tom Lane [Wed, 14 Mar 2007 18:49:04 +0000 (18:49 +0000)]
Fix a longstanding bug in VACUUM FULL's handling of update chains. The code
did not expect that a DEAD tuple could follow a RECENTLY_DEAD tuple in an
update chain, but because the OldestXmin rule for determining deadness is a
simplification of reality, it is possible for this situation to occur
(implying that the RECENTLY_DEAD tuple is in fact dead to all observers,
but this patch does not attempt to exploit that). The code would follow a
chain forward all the way, but then stop before a DEAD tuple when backing
up, meaning that not all of the chain got moved. This could lead to copying
the chain multiple times (resulting in duplicate copies of the live tuple at
its end), or leaving dangling index entries behind (which, aside from
generating warnings from later vacuums, creates a risk of wrong query
results or bogus duplicate-key errors once the heap slot the index entry
points to is repopulated).
The fix is to recheck HeapTupleSatisfiesVacuum while following a chain
forward, and to stop if a DEAD tuple is reached. Each contiguous group
of RECENTLY_DEAD tuples will therefore be copied as a separate chain.
The patch also adds a couple of extra sanity checks to verify correct
behavior.
Tom Lane [Wed, 14 Mar 2007 17:38:15 +0000 (17:38 +0000)]
Arrange to install a "posixrules" entry in our timezone database, so that
POSIX-style timezone specs that don't exactly match any database entry will
be treated as having correct USA DST rules. Also, document that this can
be changed if you want to use some other DST rules with a POSIX zone spec.
We could consider changing localtime.c's TZDEFRULESTRING, but since that
facility can only deal with one DST transition rule, it seems fairly useless
now; might as well just plan to override it using a "posixrules" entry.
Backpatch as far as 8.0. There isn't much we can do in 7.x ... either your
libc gets it right, or it doesn't.
Alvaro Herrera [Sun, 11 Mar 2007 06:44:11 +0000 (06:44 +0000)]
Fix a race condition that caused pg_database_size() and pg_tablespace_size()
to fail if an object was removed between calls to ReadDir() and stat().
Per discussion in pgsql-hackers.
Magnus Hagander [Thu, 8 Mar 2007 19:27:48 +0000 (19:27 +0000)]
Remove unsafe calling of WSAStartup and WSACleanup from DllMain. Move the
inline cleanup call around so it will be called in the right order, and
be called on errors.
Tom Lane [Thu, 8 Mar 2007 17:03:43 +0000 (17:03 +0000)]
Fix vac_update_relstats to ensure it always sends a relcache inval message,
even if none of the fields in the pg_class row change. This behavior is
necessary to ensure other backends flush rd_targblock values that might
point to truncated-away pages. We got this right pre-8.2 but it was broken
by overoptimistic change to not write out the pg_class row if unchanged.
Per report from Pavan Deolasee.
Teodor Sigaev [Wed, 7 Mar 2007 21:25:18 +0000 (21:25 +0000)]
Athough cube is a varlena type, nowhere was a detoasting of cube's value, so
fix it. Add macroses DatumGetNDBOX, PG_GETARG_NDBOX and PG_RETURN_NDBOX.
Backpatch for 8.2 too.
Previous versions use version 0 calling conventions. And fmgr code detoast
values for user-defined functions.
Tom Lane [Tue, 6 Mar 2007 22:45:23 +0000 (22:45 +0000)]
Fix oversight in original coding of inline_function(): since
check_sql_fn_retval allows binary-compatibility cases, the expression
extracted from an inline-able SQL function might have a type that is only
binary-compatible with the declared function result type. To avoid possibly
changing the semantics of the expression, we should insert a RelabelType node
in such cases. This has only been shown to have bad consequences in recent
8.1 and up releases, but I suspect there may be failure cases in the older
branches too, so patch it all the way back. Per bug #3116 from Greg Mullane.
Along the way, fix an omission in eval_const_expressions_mutator: it failed
to copy the relabelformat field when processing a RelabelType. No known
observable failures from this, but it definitely isn't intended behavior.
Tom Lane [Thu, 1 Mar 2007 18:50:36 +0000 (18:50 +0000)]
Fix markQueryForLocking() to work correctly in the presence of nested views.
It has been wrong for this case since it was first written for 7.1 :-(
Per report from Pavel HanĂ¡k.
Tom Lane [Sun, 18 Feb 2007 19:49:30 +0000 (19:49 +0000)]
Fix portal management code to support non-default command completion tags for
portals using PORTAL_UTIL_SELECT strategy. This is currently significant only
for FETCH queries, which are supposed to include a count in the tag. Seems
it's been broken since 7.4, but nobody noticed before Knut Lehre.
Tom Lane [Fri, 16 Feb 2007 20:57:26 +0000 (20:57 +0000)]
Adjust the definition of is_pushed_down so that it's always true for INNER
JOIN quals, just like WHERE quals, even if they reference every one of the
join's relations. Now that we can reorder outer and inner joins, it's
possible for such a qual to end up being assigned to an outer join plan node,
and we mustn't have it treated as a join qual rather than a filter qual for
the node. (If it were, the join could produce null-extended rows that it
shouldn't.) Per bug report from Pelle Johansson.
Tom Lane [Fri, 16 Feb 2007 03:49:10 +0000 (03:49 +0000)]
Fix another problem in 8.2 changes that allowed "one-time" qual conditions to
be checked at plan levels below the top; namely, we have to allow for Result
nodes inserted just above a nestloop inner indexscan. Should think about
using the general Param mechanism to pass down outer-relation variables, but
for the moment we need a back-patchable solution. Per report from Phil Frost.
Tom Lane [Fri, 16 Feb 2007 00:14:08 +0000 (00:14 +0000)]
Restructure code that is responsible for ensuring that clauseless joins are
considered when it is necessary to do so because of a join-order restriction
(that is, an outer-join or IN-subselect construct). The former coding was a
bit ad-hoc and inconsistent, and it missed some cases, as exposed by Mario
Weilguni's recent bug report. His specific problem was that an IN could be
turned into a "clauseless" join due to constant-propagation removing the IN's
joinclause, and if the IN's subselect involved more than one relation and
there was more than one such IN linking to the same upper relation, then the
only valid join orders involve "bushy" plans but we would fail to consider the
specific paths needed to get there. (See the example case added to the join
regression test.) On examining the code I wonder if there weren't some other
problem cases too; in particular it seems that GEQO was defending against a
different set of corner cases than the main planner was. There was also an
efficiency problem, in that when we did realize we needed a clauseless join
because of an IN, we'd consider clauseless joins against every other relation
whether this was sensible or not. It seems a better design is to use the
outer-join and in-clause lists as a backup heuristic, just as the rule of
joining only where there are joinclauses is a heuristic: we'll join two
relations if they have a usable joinclause *or* this might be necessary to
satisfy an outer-join or IN-clause join order restriction. I refactored the
code to have just one place considering this instead of three, and made sure
that it covered all the cases that any of them had been considering.
Backpatch as far as 8.1 (which has only the IN-clause form of the disease).
By rights 8.0 and 7.4 should have the bug too, but they accidentally fail
to fail, because the joininfo structure used in those releases preserves some
memory of there having once been a joinclause between the inner and outer
sides of an IN, and so it leads the code in the right direction anyway.
I'll be conservative and not touch them.
Tom Lane [Thu, 15 Feb 2007 03:07:21 +0000 (03:07 +0000)]
Repair oversight in 8.2 change that improved the handling of "pseudoconstant"
WHERE clauses. createplan.c is now willing to stick a gating Result node
almost anywhere in the plan tree, and in particular one can wind up directly
underneath a MergeJoin node. This means it had better be willing to handle
Mark/Restore. Fortunately, that's trivial in such cases, since we can just
pass off the call to the input node (which the planner has previously ensured
can handle Mark/Restore). Per report from Phil Frost.
Tom Lane [Tue, 13 Feb 2007 19:39:48 +0000 (19:39 +0000)]
Disallow committing a prepared transaction unless we are in the same database
it was executed in. Someday it might be nice to allow cross-DB commits, but
work would be needed in NOTIFY and perhaps other places. Per Heikki.
Tom Lane [Tue, 13 Feb 2007 02:31:12 +0000 (02:31 +0000)]
Repair bug in 8.2's new logic for planning outer joins: we have to allow joins
that overlap an outer join's min_righthand but aren't fully contained in it,
to support joining within the RHS after having performed an outer join that
can commute with this one. Aside from the direct fix in make_join_rel(),
fix has_join_restriction() and GEQO's desirable_join() to consider this
possibility. Per report from Ian Harding.
Tom Lane [Thu, 8 Feb 2007 18:37:43 +0000 (18:37 +0000)]
Fix an ancient logic error in plpgsql's exec_stmt_block: it thought it could
get away with not (re)initializing a local variable if the variable is marked
"isconst" and not "isnull". Unfortunately it makes this decision after having
already freed the old value, meaning that something like
for i in 1..10 loop
declare c constant text := 'hi there';
leads to subsequent accesses to freed memory, and hence probably crashes.
(In particular, this is why Asif Ali Rehman's bug leads to crash and not
just an unexpectedly-NULL value for SQLERRM: SQLERRM is marked CONSTANT
and so triggers this error.)
The whole thing seems wrong on its face anyway: CONSTANT means that you can't
change the variable inside the block, not that the initializer expression is
guaranteed not to change value across successive block entries. Hence,
remove the "optimization" instead of trying to fix it.
Tom Lane [Thu, 8 Feb 2007 18:37:38 +0000 (18:37 +0000)]
Rearrange use of plpgsql_add_initdatums() so that only the parsing of a
DECLARE section needs to know about it. Formerly, everyplace besides DECLARE
that created variables needed to do "plpgsql_add_initdatums(NULL)" to prevent
those variables from being sucked up as part of a subsequent DECLARE block.
This is obviously error-prone, and in fact the SQLSTATE/SQLERRM patch had
failed to do it for those two variables, leading to the bug recently exhibited
by Asif Ali Rehman: a DECLARE within an exception handler tried to reinitialize
SQLERRM.
Although the SQLSTATE/SQLERRM patch isn't in any pre-8.1 branches, and so
I can't point to a demonstrable failure there, it seems wise to back-patch
this into the older branches anyway, just to keep the logic similar to HEAD.