Magnus Hagander [Thu, 12 Jul 2007 14:13:06 +0000 (14:13 +0000)]
Fix freeing of names in Kerberos when using MIT - need to use the
free function provided in the Kerberos library.
This fixes a very hard-to-track-down heap corruption on Windows
when using debug runtimes.
Joe Conway [Mon, 9 Jul 2007 01:32:30 +0000 (01:32 +0000)]
Restrict non-superusers to password authenticated connections
to prevent possible escalation of privilege. Provide new SECURITY
DEFINER functions with old behavior, but initially REVOKE ALL
from public for these functions. Per list discussion and design
proposed by Tom Lane.
Tom Lane [Sun, 8 Jul 2007 22:23:25 +0000 (22:23 +0000)]
Remove the pgstat_drop_relation() call from smgr_internal_unlink(), because
we don't know at that point which relation OID to tell pgstat to forget.
The code was passing the relfilenode, which is incorrect, and could possibly
cause some other relation's stats to be zeroed out. While we could try to
clean this up, it seems much simpler and more reliable to let the next
invocation of pgstat_vacuum_tabstat() fix things; which indeed is how it
worked before I introduced the buggy code into 8.1.3 and later :-(.
Problem noticed by Itagaki Takahiro, fix is per subsequent discussion.
Magnus Hagander [Mon, 2 Jul 2007 21:58:38 +0000 (21:58 +0000)]
- Fix the -w (wait) option to work in Windows service mode, per bug #3382.
- Prevent the -w option being passed to the postmaster.
- Read the postmaster options file when starting as a Windows service.
Tom Lane [Mon, 2 Jul 2007 20:12:00 +0000 (20:12 +0000)]
Fix failure to restart Postgres when Linux kernel returns EIDRM for shmctl().
This is a Linux kernel bug that apparently exists in every extant kernel
version: sometimes shmctl() will fail with EIDRM when EINVAL is correct.
We were assuming that EIDRM indicates a possible conflict with pre-existing
backends, and refusing to start the postmaster when this happens. Fortunately,
there does not seem to be any case where Linux can legitimately return EIDRM
(it doesn't track shmem segments in a way that would allow that), so we can
get away with just assuming that EIDRM means EINVAL on this platform.
Per reports from Michael Fuhr and Jon Lapham --- it's a bit surprising
we have not seen more reports, actually.
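The decision described above can be sketched as follows. This is only an
illustration in Python, not the server's C code; the function name and the
platform_is_linux flag are hypothetical.

```python
import errno

def segment_may_be_in_use(shmctl_errno, platform_is_linux):
    """Decide whether a failed shmctl() might indicate a live old segment.

    shmctl_errno is the errno reported by the failed call. The function
    name and the boolean platform flag are hypothetical.
    """
    if shmctl_errno == errno.EINVAL:
        return False  # segment does not exist, so nothing can conflict
    if shmctl_errno == errno.EIDRM and platform_is_linux:
        return False  # on Linux, read EIDRM as a misreported EINVAL
    return True  # any other failure: stay conservative

```

With this rule a Linux postmaster restarts after EIDRM, while other
platforms keep the cautious behavior.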
Tom Lane [Sun, 1 Jul 2007 17:45:49 +0000 (17:45 +0000)]
Avoid memory leakage when a series of subtransactions invoke AFTER triggers
that are fired at end-of-statement (as is the normal case for foreign keys,
for example). In this situation the per-subxact deferred trigger context
is always empty when subtransaction exit is reached; so we could free it,
but were not doing so, leading to an intratransaction leak of 8K or more
per subtransaction. Per off-list example from Viatcheslav Kalinin
subsequent to bug #3418 (his original bug report omitted a foreign key
constraint needed to cause this leak).
Back-patch to 8.2; prior versions were not using per-subxact contexts
for deferred triggers, so did not have this leak.
Tom Lane [Fri, 29 Jun 2007 16:18:52 +0000 (16:18 +0000)]
Fix computation of PG_VERSION_NUM by configure: remove unnecessary and
unportable backslashes in awk script (per Patrick Welche), and add
brackets to prevent autoconf from mangling sed's regexp (the sed call
here never did what was expected).
Tom Lane [Fri, 29 Jun 2007 01:51:49 +0000 (01:51 +0000)]
Fix a passel of ancient bugs in to_char(), including two distinct buffer
overruns (neither of which seems likely to be exploitable as security holes,
fortunately, since the provoker can't control the data written). One of
these is due to choosing to stomp on the output of a called function, which
is bad news in any case; make it treat the called functions' results as
read-only. Avoid some unnecessary palloc/pfree traffic too; it's not
really helpful to free small temporary objects, and again this is presuming
more than it ought to about the nature of the results of called functions.
Per report from Patrick Welche and additional code-reading by Imad.
Tom Lane [Thu, 28 Jun 2007 17:50:12 +0000 (17:50 +0000)]
Fix incorrect tests for undef Perl values in some places in plperl.c.
The correct test for defined-ness is SvOK(sv), not anything involving
SvTYPE. Per bug #3415 from Matt Taylor.
Back-patch as far as 8.0; no apparent problem in 7.x.
Neil Conway [Fri, 22 Jun 2007 03:19:57 +0000 (03:19 +0000)]
In psql, when running a SELECT query using a cursor, flush the query
output after each FETCH. This ensures that incremental results are
available to clients that are executing long-running SELECT queries
via the FETCH_COUNT feature.
Tom Lane [Wed, 20 Jun 2007 18:21:08 +0000 (18:21 +0000)]
transformColumnDefinition failed to complain about
create table foo (bar int default null default 3);
due to not thinking about the special-case handling of DEFAULT NULL.
Problem noticed while investigating bug #3396.
Tom Lane [Wed, 20 Jun 2007 18:15:57 +0000 (18:15 +0000)]
CREATE DOMAIN ... DEFAULT NULL failed because gram.y special-cases DEFAULT
NULL and DefineDomain didn't. Bug goes all the way back to original coding
of domains. Per bug #3396 from Sergey Burladyan.
Andrew Dunstan [Thu, 14 Jun 2007 01:49:39 +0000 (01:49 +0000)]
Implement a chunking protocol for writes to the syslogger pipe, with messages
reassembled in the syslogger before writing to the log file. This prevents
partial messages from being written, which mucks up log rotation, and
messages from different backends being interleaved, which causes garbled
logs. Backport as far as 8.0, where the syslogger was introduced.
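A minimal sketch of the chunk-and-reassemble idea, in Python for brevity.
The header layout (pid, payload length, last-chunk flag) and the tiny
MAX_PAYLOAD are assumptions made for illustration and are not the format
the patch uses.

```python
import struct

# Hypothetical header: sender pid, payload length, "last chunk" flag.
HEADER = struct.Struct("<IHB")
MAX_PAYLOAD = 16  # deliberately tiny so the example produces several chunks

def split_message(pid, message):
    """Split one log message (bytes) into self-describing chunks."""
    chunks = []
    for start in range(0, max(len(message), 1), MAX_PAYLOAD):
        payload = message[start:start + MAX_PAYLOAD]
        is_last = start + MAX_PAYLOAD >= len(message)
        chunks.append(HEADER.pack(pid, len(payload), int(is_last)) + payload)
    return chunks

class Reassembler:
    """Collect chunks per sender and release only complete messages."""

    def __init__(self):
        self.partial = {}  # pid -> bytearray of payload received so far

    def feed(self, chunk):
        pid, length, is_last = HEADER.unpack_from(chunk)
        payload = chunk[HEADER.size:HEADER.size + length]
        self.partial.setdefault(pid, bytearray()).extend(payload)
        if is_last:
            return pid, bytes(self.partial.pop(pid))
        return None  # message still incomplete, write nothing yet
```

Because the collector buffers per sender, chunks from different backends
may arrive interleaved without their messages being mixed in the output.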
Tom Lane [Tue, 12 Jun 2007 15:58:39 +0000 (15:58 +0000)]
Fix DecodeDateTime to allow timezone to appear before year. This had
historically worked in some but not all cases, but as of 8.2 it failed for all
timezone formats. Fix, and add regression test cases to catch future
regressions in this area. Per gripe from Adam Witney.
Tom Lane [Sat, 9 Jun 2007 15:52:38 +0000 (15:52 +0000)]
Allow numeric_fac() to be interrupted, since it can take quite a while for
large inputs. Also cause it to error out immediately if the result will
overflow, instead of grinding through a lot of calculation first.
Per gripe from Jim Nasby.
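The two behaviors can be illustrated with a small Python sketch. MAX_INPUT
is a made-up limit and is_cancelled stands in for the server's interrupt
check; neither is taken from the patch.

```python
MAX_INPUT = 1000  # hypothetical bound, not the server's real limit

class Cancelled(Exception):
    """Raised when the caller asks for the computation to stop."""

def factorial(n, is_cancelled=lambda: False, max_input=MAX_INPUT):
    # Reject hopeless inputs up front instead of computing for a long
    # time and only then discovering that the result cannot be stored.
    if n > max_input:
        raise OverflowError("value overflows result format")
    result = 1
    for i in range(2, n + 1):
        # Poll for cancellation on every iteration so that a long
        # computation can be interrupted.
        if is_cancelled():
            raise Cancelled()
        result *= i
    return result
```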
Teodor Sigaev [Mon, 4 Jun 2007 15:59:20 +0000 (15:59 +0000)]
Fix a bundle of GIN bugs:
- Fix possible deadlock between UPDATE and VACUUM queries. The bug was never
observed in 8.2, but it still exists there. HEAD is more sensitive to the
bug after the recent "ring" of buffers improvements.
- Fix WAL creation: if the parent page is stored as-is after a split, then the
incomplete split isn't removed during replay. This happens rather rarely, only
on large tables with a lot of updates/inserts.
- Fix WAL replay: there was a wrong test of XLR_BKP_BLOCK_* for the left
page after deletion of a page. That caused a wrong rightlink field: it pointed
to the deleted page.
- Add a check for a match when clearing an incomplete split.
- Clean up the incomplete split list after processing.
None of these changes alter on-disk storage, so backpatch...
But the second point may be an issue when replaying logs from a previous version.
Magnus Hagander [Mon, 4 Jun 2007 13:39:41 +0000 (13:39 +0000)]
On win32, retry reading when WSARecv returns WSAEWOULDBLOCK. There seem
to be cases when at least Windows 2000 can do this even though select
just indicated that the socket is readable.
Tom Lane [Fri, 1 Jun 2007 23:43:17 +0000 (23:43 +0000)]
Fix aboriginal bug in BufFileDumpBuffer that would cause it to write the
wrong data when dumping a bufferload that crosses a component-file boundary.
This probably has not been seen in the wild because (a) component files are
normally 1GB apiece and (b) non-block-aligned buffer usage is relatively
rare. But it's fairly easy to reproduce a problem if one reduces RELSEG_SIZE
in a test build. Kudos to Kurt Harriman for spotting the bug.
Tom Lane [Fri, 1 Jun 2007 15:58:02 +0000 (15:58 +0000)]
Fix performance problems in multi-batch hash joins by ensuring that we select
a well-randomized batch number even when given a poorly-randomized hash value.
This is a bit inefficient but seems the only practical solution given the
constraint that we can't change the hash functions in released branches.
Per report from Joseph Shraibman.
Applied to 8.1 and 8.2 only --- HEAD is getting a cleaner fix, and 8.0 and
before use different coding that seems less vulnerable.
Tom Lane [Wed, 30 May 2007 21:01:45 +0000 (21:01 +0000)]
Fix overly-strict sanity check in BeginInternalSubTransaction that made it
fail when used in a deferred trigger. Bug goes back to 8.0; no doubt the
reason it hadn't been noticed is that we've been discouraging use of
user-defined constraint triggers. Per report from Frank van Vugt.
Neil Conway [Tue, 29 May 2007 04:59:15 +0000 (04:59 +0000)]
Fix a bug in input processing for the "interval" type. Previously,
"microsecond" and "millisecond" units were not considered valid input
by themselves, which caused inputs like "1 millisecond" to be rejected
erroneously.
Update the docs, add regression tests, and backport to 8.2 and 8.1.
Tom Lane [Tue, 22 May 2007 23:24:09 +0000 (23:24 +0000)]
Repair planner bug introduced in 8.2 by ability to rearrange outer joins:
in cases where a sub-SELECT inserts a WHERE clause between two outer joins,
that clause may prevent us from re-ordering the two outer joins. The code
was considering only the joins' own ON-conditions in determining reordering
safety, which is not good enough. Add a "delay_upper_joins" flag to
OuterJoinInfo to flag that we have detected such a clause and higher-level
outer joins shouldn't be permitted to commute with this one. (This might
seem overly coarse, but given the current rules for OJ reordering, it's
sufficient AFAICT.)
The failure case is actually pretty narrow: it needs a WHERE clause within
the RHS of a left join that checks the RHS of a lower left join, but is not
strict for that RHS (else we'd have simplified the lower join to a plain
join). Even then no failure will be manifest unless the planner chooses to
rearrange the join order.
Tom Lane [Tue, 22 May 2007 01:40:42 +0000 (01:40 +0000)]
Fix best_inner_indexscan to return both the cheapest-total-cost and
cheapest-startup-cost innerjoin indexscans, and make joinpath.c consider
both of these (when different) as the inside of a nestloop join. The
original design was based on the assumption that indexscan paths always
have negligible startup cost, and so total cost is the only important
figure of merit; an assumption that's obviously broken by bitmap
indexscans. This oversight could lead to choosing poor plans in cases
where fast-start behavior is more important than total cost, such as
LIMIT and IN queries. 8.1-vintage brain fade exposed by an example from
Chuck D.
Tom Lane [Fri, 18 May 2007 01:20:25 +0000 (01:20 +0000)]
Remove redundant logging of send failures when SSL is in use. While pqcomm.c
had been taught not to do that ages ago, the SSL code was helpfully bleating
anyway. Resolves some recent reports such as bug #3266; however the
underlying cause of the related bug #2829 is still unclear.
Tom Lane [Thu, 17 May 2007 23:31:59 +0000 (23:31 +0000)]
Temporary fix for the problem that pg_stat_activity, inet_client_addr(),
and inet_server_addr() fail if the client connected over a "scoped" IPv6
address. In this case getnameinfo() will return a string ending with
a poorly-standardized "%something" zone specifier, which these functions
try to feed to network_in(), which won't take it. So that we don't lose
functionality altogether, suppress the zone specifier before giving the
string to network_in(). Per report from Brian Hirt.
TODO: probably someday the inet type should support scoped IPv6 addresses,
and then this patch should be reverted.
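The workaround amounts to cutting the string at the first "%" before
parsing it. A Python sketch, where strip_zone is a hypothetical helper and
ipaddress.ip_address merely stands in for network_in():

```python
import ipaddress

def strip_zone(host):
    """Drop a trailing '%zone' specifier, if any, from a numeric host."""
    return host.split("%", 1)[0]

def parse_client_address(host):
    # Stand-in for handing the cleaned-up string to an address parser
    # that does not understand scoped IPv6 addresses.
    return ipaddress.ip_address(strip_zone(host))
```

The zone information is lost, which is why the message calls this a
temporary fix.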
Alvaro Herrera [Tue, 15 May 2007 20:20:24 +0000 (20:20 +0000)]
Avoid emitting empty role names in the GRANTED BY clause of GRANT ROLE
when the grantor has been dropped. This is a workaround for the fact
that we don't track the grantor as a shared dependency.
Neil Conway [Tue, 15 May 2007 15:35:58 +0000 (15:35 +0000)]
Add a note to the documentation to clarify that even when
"autovacuum = off", the system may still periodically start autovacuum
processes to prevent XID wraparound. Patch from David Fetter, with
editorializing.
Tom Lane [Sat, 12 May 2007 19:22:43 +0000 (19:22 +0000)]
Improve predicate_refuted_by_simple_clause() to handle IS NULL and IS NOT NULL
more completely. The motivation for having it understand IS NULL at all was
to allow use of "foo IS NULL" as one of the subsets of a partitioning on
"foo", but as reported by Aleksander Kmetec, it wasn't really getting the job
done. Backpatch to 8.2 since this is arguably a performance bug.
Tom Lane [Fri, 11 May 2007 20:18:21 +0000 (20:18 +0000)]
Fix my oversight in enabling domains-of-domains: ALTER DOMAIN ADD CONSTRAINT
needs to check the new constraint against columns of derived domains too.
Also, make it error out if the domain to be modified is used within any
composite-type columns. Eventually we should support that case, but it seems
a bit painful, and not suitable for a back-patch. For the moment just let the
user know we can't do it.
Backpatch to 8.2, which is the only released version that allows nested
domains. Possibly the other part should be back-patched further.
Magnus Hagander [Sat, 5 May 2007 17:05:55 +0000 (17:05 +0000)]
Check return code from strxfrm on Windows since it has a
non-standard way of indicating errors, so we don't try to
allocate INT_MAX bytes to store a result in.
Tom Lane [Tue, 1 May 2007 18:54:02 +0000 (18:54 +0000)]
Fix a thinko in my patch of a couple months ago for bug #3116: it did the
wrong thing when inlining polymorphic SQL functions, because it was using the
function's declared return type where it should have used the actual result
type of the current call. In 8.1 and 8.2 this causes obvious failures even if
you don't have assertions turned on; in 8.0 and 7.4 it would only be a problem
if the inlined expression were used as an input to a function that did
run-time type determination on its inputs. Add a regression test, since this
is evidently an under-tested area.
Tom Lane [Thu, 26 Apr 2007 23:24:57 +0000 (23:24 +0000)]
Fix dynahash.c to suppress hash bucket splits while a hash_seq_search() scan
is in progress on the same hashtable. This seems the least invasive way to
fix the recently-recognized problem that a split could cause the scan to
visit entries twice or (with much lower probability) miss them entirely.
The only field-reported problem caused by this is the "failed to re-find
shared lock object" PANIC in COMMIT PREPARED reported by Michel Dorochevsky,
which was caused by multiply visited entries. However, it seems certain
that mdsync() is vulnerable to missing required fsync's due to missed
entries, and I am fearful that RelationCacheInitializePhase2() might be at
risk as well. Because of that and the generalized hazard presented by this
bug, back-patch all the supported branches.
Along the way, fix pg_prepared_statement() and pg_cursor() to not assume
that the hashtables they are examining will stay static between calls.
This is risky regardless of the newly noted dynahash problem, because
hash_seq_search() has never promised to cope with deletion of table entries
other than the just-returned one. There may be no bug here because the only
supported way to call these functions is via ExecMakeTableFunctionResult()
which will cycle them to completion before doing anything very interesting,
but it seems best to get rid of the assumption. This affects 8.2 and HEAD
only, since those functions weren't there earlier.
Tom Lane [Fri, 20 Apr 2007 02:37:49 +0000 (02:37 +0000)]
Support explicit placement of the temporary-table schema within search_path.
This is needed to allow a security-definer function to set a truly secure
value of search_path. Without it, a malicious user can use temporary objects
to execute code with the privileges of the security-definer function. Even
pushing the temp schema to the back of the search path is not quite good
enough, because a function or operator at the back of the path might still
capture control from one nearer the front due to having a more exact datatype
match. Hence, disable searching the temp schema altogether for functions and
operators.
Tom Lane [Thu, 19 Apr 2007 20:24:10 +0000 (20:24 +0000)]
Repair PANIC condition in hash indexes when a previous index extension attempt
failed (due to lock conflicts or out-of-space). We might have already
extended the index's filesystem EOF before failing, causing the EOF to be
beyond what the metapage says is the last used page. Hence the invariant
maintained by the code needs to be "EOF is at or beyond last used page",
not "EOF is exactly the last used page". Problem was created by my patch
of 2006-11-19 that attempted to repair bug #2737. Since that was
back-patched to 7.4, this needs to be as well. Per report and test case
from Vlastimil Krejcir.
Tom Lane [Thu, 19 Apr 2007 16:33:32 +0000 (16:33 +0000)]
Fix plpgsql to avoid reference to already-freed memory when returning a
pass-by-reference data type and the RETURN statement is within an EXCEPTION
block. Bug introduced by my fix of 2007-01-28 to use per-subtransaction
ExprContexts/EStates; since that wasn't back-patched into older branches,
only 8.2 and HEAD are affected. Per report from Gary Winslow.
Bruce Momjian [Wed, 18 Apr 2007 00:18:31 +0000 (00:18 +0000)]
Document that the COPY delimiter must be an ASCII byte, rather than a
multi-byte value. It can also be a single-byte encoded character if
the client and server versions match.
Tom Lane [Tue, 17 Apr 2007 20:03:10 +0000 (20:03 +0000)]
Rewrite choose_bitmap_and() to make it more robust in the presence of
competing alternatives for indexes to use in a bitmap scan. The former
coding took estimated selectivity as an overriding factor, causing it to
sometimes choose indexes that were much slower to scan than ones with a
slightly worse selectivity. It was also too narrow-minded about which
combinations of indexes to consider ANDing. The rewrite makes it pay more
attention to index scan cost than selectivity; this seems sane since it's
impossible to have very bad selectivity with low cost, whereas the reverse
isn't true. Also, we now consider each index alone, as well as adding
each index to an AND-group led by each prior index, for a total of about
O(N^2) rather than O(N) combinations considered. This makes the results
much less dependent on the exact order in which the indexes are
considered. It's still a lot cheaper than an O(2^N) exhaustive search.
A prefilter step eliminates all but the cheapest of those indexes using
the same set of WHERE conditions, to keep the effective value of N down in
scenarios where the DBA has created lots of partially-redundant indexes.
Tom Lane [Mon, 16 Apr 2007 18:42:17 +0000 (18:42 +0000)]
Fix pg_dump to not crash if -t or a similar switch is used to select a serial
sequence for dumping without also selecting its owning table. Make it not try
to emit ALTER SEQUENCE OWNED BY in this situation.
Per report from Michael Nolan.
Tom Lane [Thu, 12 Apr 2007 17:11:00 +0000 (17:11 +0000)]
Rearrange mdsync() looping logic to avoid the problem that a sufficiently
fast flow of new fsync requests can prevent mdsync() from ever completing.
This was an unforeseen consequence of a patch added in Mar 2006 to prevent
the fsync request queue from overflowing. Problem identified by Heikki
Linnakangas and independently by ITAGAKI Takahiro; fix based on ideas from
Takahiro-san, Heikki, and Tom.
Back-patch as far as 8.1 because a previous back-patch introduced the problem
into 8.1 ...
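One way to get a sync pass that always terminates is to stamp each request
with a cycle number and have a pass handle only requests from earlier
cycles. The Python sketch below illustrates that idea under assumed names
(PendingSyncs, sync_pass); it is not the code in md.c, and error handling
is omitted.

```python
class PendingSyncs:
    """Queue of fsync requests with a per-pass cycle counter."""

    def __init__(self):
        self.cycle = 0
        self.pending = {}  # file name -> cycle in which it was requested

    def request(self, name):
        # Keep the oldest stamp so an existing request stays in its pass.
        self.pending.setdefault(name, self.cycle)

    def sync_pass(self, do_fsync):
        # Start a new cycle: requests arriving from now on get the new
        # stamp and are left for the next pass.
        self.cycle += 1
        old = [n for n, c in self.pending.items() if c < self.cycle]
        for name in old:
            # Forget the request first, so one arriving during the
            # fsync is recorded afresh rather than lost.
            del self.pending[name]
            do_fsync(name)
```

However fast new requests arrive, a pass does a bounded amount of work:
the requests that were queued when it began.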
Tom Lane [Thu, 12 Apr 2007 15:04:41 +0000 (15:04 +0000)]
Cancel pending fsync requests during WAL replay of DROP DATABASE, per bug
report from David Darville. Back-patch as far as 8.1, which may or may not
have the problem but it seems a safe change anyway.
Tom Lane [Mon, 2 Apr 2007 18:49:36 +0000 (18:49 +0000)]
Fix check_sql_fn_retval to allow the case where a SQL function declared to
return void ends with a SELECT, if that SELECT has a single result that is
also of type void. Without this, it's hard to write a void function that
calls another void function. Per gripe from Peter.
Tom Lane [Fri, 30 Mar 2007 00:13:05 +0000 (00:13 +0000)]
Fix oversight in coding of _bt_start_vacuum: we can't assume that the LWLock
will be released by transaction abort before _bt_end_vacuum gets called.
If either of these "can't happen" errors actually happened, we'd freeze up
trying to acquire an already-held lock. Latest word is that this does
not explain Martin Pitt's trouble report, but it still looks like a bug.
Tom Lane [Sat, 17 Mar 2007 03:15:47 +0000 (03:15 +0000)]
SPI_cursor_open failed to enforce that only read-only queries could be
executed in read_only mode. This could lead to various relatively-subtle
failures, such as an allegedly stable function returning non-stable results.
Bug goes all the way back to the introduction of read-only mode in 8.0.
Per report from Gaetano Mendola.
Tom Lane [Wed, 14 Mar 2007 18:49:04 +0000 (18:49 +0000)]
Fix a longstanding bug in VACUUM FULL's handling of update chains. The code
did not expect that a DEAD tuple could follow a RECENTLY_DEAD tuple in an
update chain, but because the OldestXmin rule for determining deadness is a
simplification of reality, it is possible for this situation to occur
(implying that the RECENTLY_DEAD tuple is in fact dead to all observers,
but this patch does not attempt to exploit that). The code would follow a
chain forward all the way, but then stop before a DEAD tuple when backing
up, meaning that not all of the chain got moved. This could lead to copying
the chain multiple times (resulting in duplicate copies of the live tuple at
its end), or leaving dangling index entries behind (which, aside from
generating warnings from later vacuums, creates a risk of wrong query
results or bogus duplicate-key errors once the heap slot the index entry
points to is repopulated).
The fix is to recheck HeapTupleSatisfiesVacuum while following a chain
forward, and to stop if a DEAD tuple is reached. Each contiguous group
of RECENTLY_DEAD tuples will therefore be copied as a separate chain.
The patch also adds a couple of extra sanity checks to verify correct
behavior.
Tom Lane [Wed, 14 Mar 2007 17:38:15 +0000 (17:38 +0000)]
Arrange to install a "posixrules" entry in our timezone database, so that
POSIX-style timezone specs that don't exactly match any database entry will
be treated as having correct USA DST rules. Also, document that this can
be changed if you want to use some other DST rules with a POSIX zone spec.
We could consider changing localtime.c's TZDEFRULESTRING, but since that
facility can only deal with one DST transition rule, it seems fairly useless
now; might as well just plan to override it using a "posixrules" entry.
Backpatch as far as 8.0. There isn't much we can do in 7.x ... either your
libc gets it right, or it doesn't.
Alvaro Herrera [Sun, 11 Mar 2007 06:44:11 +0000 (06:44 +0000)]
Fix a race condition that caused pg_database_size() and pg_tablespace_size()
to fail if an object was removed between calls to ReadDir() and stat().
Per discussion in pgsql-hackers.
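The same race exists in any "list the directory, then stat each entry"
loop. A Python sketch of the tolerant version, where directory_size is a
hypothetical helper and not the server function:

```python
import os

def directory_size(path):
    """Sum the sizes of a directory's entries, tolerating removals."""
    total = 0
    for name in os.listdir(path):
        try:
            total += os.stat(os.path.join(path, name)).st_size
        except FileNotFoundError:
            # The entry vanished between listing and stat; skip it
            # rather than failing the whole computation.
            continue
    return total
```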
Magnus Hagander [Thu, 8 Mar 2007 19:27:48 +0000 (19:27 +0000)]
Remove unsafe calling of WSAStartup and WSACleanup from DllMain. Move the
inline cleanup call around so it will be called in the right order, and
be called on errors.
Tom Lane [Thu, 8 Mar 2007 17:03:43 +0000 (17:03 +0000)]
Fix vac_update_relstats to ensure it always sends a relcache inval message,
even if none of the fields in the pg_class row change. This behavior is
necessary to ensure other backends flush rd_targblock values that might
point to truncated-away pages. We got this right pre-8.2 but it was broken
by overoptimistic change to not write out the pg_class row if unchanged.
Per report from Pavan Deolasee.
Teodor Sigaev [Wed, 7 Mar 2007 21:25:18 +0000 (21:25 +0000)]
Although cube is a varlena type, its values were not detoasted anywhere, so
fix that. Add macros DatumGetNDBOX, PG_GETARG_NDBOX and PG_RETURN_NDBOX.
Backpatch to 8.2 too.
Previous versions use version 0 calling conventions, and the fmgr code detoasts
values for user-defined functions.
Tom Lane [Tue, 6 Mar 2007 22:45:23 +0000 (22:45 +0000)]
Fix oversight in original coding of inline_function(): since
check_sql_fn_retval allows binary-compatibility cases, the expression
extracted from an inline-able SQL function might have a type that is only
binary-compatible with the declared function result type. To avoid possibly
changing the semantics of the expression, we should insert a RelabelType node
in such cases. This has only been shown to have bad consequences in recent
8.1 and up releases, but I suspect there may be failure cases in the older
branches too, so patch it all the way back. Per bug #3116 from Greg Mullane.
Along the way, fix an omission in eval_const_expressions_mutator: it failed
to copy the relabelformat field when processing a RelabelType. No known
observable failures from this, but it definitely isn't intended behavior.
Tom Lane [Thu, 1 Mar 2007 18:50:36 +0000 (18:50 +0000)]
Fix markQueryForLocking() to work correctly in the presence of nested views.
It has been wrong for this case since it was first written for 7.1 :-(
Per report from Pavel Hanák.