Tom Lane [Fri, 2 Feb 2007 00:03:30 +0000 (00:03 +0000)]
Repair insufficiently careful type checking for SQL-language functions:
we should check that the function code returns the claimed result datatype
every time we parse the function for execution. Formerly, for simple
scalar result types we assumed the creation-time check was sufficient, but
this fails if the function selects from a table that's been redefined since
then, and even more obviously fails if check_function_bodies had been OFF.
This is a significant security hole: not only can one trivially crash the
backend, but with appropriate misuse of pass-by-reference datatypes it is
possible to read out arbitrary locations in the server process's memory,
which could allow retrieving database content the user should not be able
to see. Our thanks to Jeff Trout for the initial report.
Tom Lane [Tue, 30 Jan 2007 22:05:25 +0000 (22:05 +0000)]
Repair oversights in the mechanism used to store compiled plpgsql functions.
The original coding failed (tried to access deallocated memory) if there were
two active call sites (fn_extra pointers) for the same function and the
function definition was updated. Also, if an update of a recursive function
was detected upon nested entry to the function, the existing compiled version
was summarily deallocated, resulting in crash upon return to the outer
instance. Problem observed while studying a bug report from Sergiy
Vyshnevetskiy.
Bug does not exist before 8.1 since older versions just leaked the memory of
obsoleted compiled functions, rather than trying to reclaim it.
Tom Lane [Tue, 30 Jan 2007 18:02:34 +0000 (18:02 +0000)]
Add SPI_push/SPI_pop calls so that datatype input and output functions called
by plpgsql can themselves use SPI --- possibly indirectly, as in the case
of domain_in() invoking plpgsql functions in a domain check constraint.
Per bug #2945 from Sergiy Vyshnevetskiy.
Somewhat arbitrarily, I've chosen to back-patch this as far as 8.0. Given
the lack of prior complaints, it doesn't seem critical for 7.x.
Tom Lane [Sat, 27 Jan 2007 20:53:41 +0000 (20:53 +0000)]
Correct an old logic error in btree page splitting: when considering a split
exactly at the point where we need to insert a new item, the calculation used
the wrong size for the "high key" of the new left page. This could lead to
choosing an unworkable split, resulting in "PANIC: failed to add item to the
left sibling" (or "right sibling") failure. Although this bug has been there
a long time, it's very difficult to trigger a failure before 8.2, since there
was generally a lot of free space on both sides of a chosen split. In 8.2,
where the user-selected fill factor determines how much free space the code
tries to leave, an unworkable split is much more likely. Report by Joe
Conway, diagnosis and fix by Heikki Linnakangas.
Tom Lane [Sat, 27 Jan 2007 20:15:55 +0000 (20:15 +0000)]
Back-port changes of Jan 16 and 17 to "revoke" pending fsync requests during
DROP TABLE and DROP DATABASE. Should prevent unexpected "permission denied"
failures on Windows, and is cleaner on other platforms too since we no longer
have to take it on faith that ENOENT is okay during an fsync attempt.
Patched as far back as 8.1; per recent discussion I think we are not going
to worry about Windows-specific issues in 8.0 anymore.
Tom Lane [Wed, 24 Jan 2007 17:12:29 +0000 (17:12 +0000)]
Get pg_utf_mblen(), pg_utf2wchar_with_len(), and utf2ucs() all on the same
page about the maximum UTF8 sequence length we support (4 bytes since 8.1,
3 before that). pg_utf2wchar_with_len never got updated to support 4-byte
characters at all, and in any case had a buffer-overrun risk in that it
could produce multiple pg_wchars from what mblen claims to be just one UTF8
character. The only reason we don't have a major security hole is that most
callers allocate worst-case output buffers; the sole exception in released
versions appears to be pre-8.2 iwchareq() (ie, ILIKE), which can be crashed
due to zeroing out its return address --- but AFAICS that can't be exploited
for anything more than a crash, due to inability to control what gets written
there. Per report from James Russell and Michael Fuhr.
Pre-8.1 the risk is much less, but I still think pg_utf2wchar_with_len's
behavior given an incomplete final character risks buffer overrun, so
back-patch that logic change anyway.
This patch also makes sure that UTF8 sequences exceeding the supported
length (whichever it is) are consistently treated as error cases, rather
than being treated like a valid shorter sequence in some places.
Tom Lane [Wed, 24 Jan 2007 01:25:56 +0000 (01:25 +0000)]
Relax an Assert() that has been found to be too strict in some situations
involving unions of types having typmods. Variants of the failure are known
to occur in 8.1 and up; not sure if it's possible in 8.0 and 7.4, but since
the code exists that far back, I'll just patch 'em all. Per report from
Brian Hurt.
Tom Lane [Thu, 11 Jan 2007 23:06:16 +0000 (23:06 +0000)]
Fix a performance problem in databases with large numbers of tables
(or other types of pg_class entry): the function pgstat_vacuum_tabstat,
invoked during VACUUM startup, had runtime proportional to the number of
stats table entries times the number of pg_class rows; in other words
O(N^2) if the stats collector's information is reasonably complete.
Replace list searching with a hash table to bring it back to O(N)
behavior. Per report from kim at myemma.com.
Back-patch as far as 8.1; 8.0 and before use different coding here.
Tatsuo Ishii [Wed, 10 Jan 2007 01:44:30 +0000 (01:44 +0000)]
Back port patch.
Call srandom() instead of srand().
pgbench calls random() later, so it should have called srandom().
On most platforms except Windows srandom() is actually identical
to srand(), so the bug only bites Windows users.
per bug report from Akio Ishida.
Tom Lane [Wed, 3 Jan 2007 22:39:42 +0000 (22:39 +0000)]
Fix regex_fixed_prefix() to cope reasonably well with regex patterns of the
form '^(foo)$'. Before, these could never be optimized into indexscans.
The recent changes to make psql and pg_dump generate such patterns (for \d
commands and -t and related switches, respectively) therefore represented
a big performance hit for people with large pg_class catalogs, as seen in
recent gripe from Erik Jones. While at it, be more paranoid about
case-sensitivity checking in multibyte encodings, and fix some other
corner cases in which a regex might be interpreted too liberally.
Tom Lane [Wed, 27 Dec 2006 22:32:03 +0000 (22:32 +0000)]
Modify local buffer management to request memory for local buffers in blocks
of increasing size, instead of one at a time. This reduces the memory
management overhead when num_temp_buffers is large: in the previous coding
we would actually waste 50% of the space used for temp buffers, because aset.c
would round the individual requests up to 16K. Problem noted while studying
a performance issue reported by Steven Flatt.
Back-patch as far as 8.1 --- older versions used few enough local buffers
that the issue isn't significant for them.
Tom Lane [Tue, 26 Dec 2006 19:27:04 +0000 (19:27 +0000)]
Repair bug #2839: the various ExecReScan functions need to reset
ps_TupFromTlist in plan nodes that make use of it. This was being done
correctly in join nodes and Result nodes but not in any relation-scan nodes.
Bug would lead to bogus results if a set-returning function appeared in the
targetlist of a subquery that could be rescanned after partial execution,
for example a subquery within EXISTS(). Bug has been around forever :-(
... surprising it wasn't reported before.
Tom Lane [Fri, 1 Dec 2006 20:49:59 +0000 (20:49 +0000)]
Document the recently-understood hazard that a rollback can release row-level
locks that logically should not be released, because when a subtransaction
overwrites XMAX all knowledge of the previous lock state is lost. It seems
unlikely that we will be able to fix this before 8.3...
Tom Lane [Tue, 28 Nov 2006 19:37:13 +0000 (19:37 +0000)]
Update timezone data to tzdata2006p zic distribution. It seems Western
Australia decided to institute DST with one month's notice ... way to go,
politicians.
Tom Lane [Tue, 28 Nov 2006 19:18:56 +0000 (19:18 +0000)]
Mark to_number() and the numeric-type variants of to_char() as stable, not
immutable, because their results depend on lc_numeric; this is a longstanding
oversight. We cannot force initdb for this in the back branches, but we can
at least provide correct catalog entries for future installations.
Tom Lane [Tue, 28 Nov 2006 05:54:32 +0000 (05:54 +0000)]
Back-patch HEAD's fixes to recognize __ppc64__ as equivalent to __powerpc64__.
Per confirmation from Brian Wipf that this is correct and necessary for
Darwin 64-bit.
Tom Lane [Tue, 28 Nov 2006 05:47:16 +0000 (05:47 +0000)]
Add $(CFLAGS) to the simplified build rule for .so libraries on Darwin.
Arguably we should do this on *all* platforms, but for the moment I'll be
conservative and just do it where it's demonstrably needed. Per report
from Brian Wipf.
Tom Lane [Fri, 24 Nov 2006 23:06:56 +0000 (23:06 +0000)]
Fix psql's \copy command to ensure that it cycles libpq back to the idle state
(in particular, causing the ReadyForQuery message to be eaten) before
returning from do_copy. The only known consequence of failing to do so is
that get_prompt might show a wrong result for the %x transaction status
escape, as reported by Bernd Helmle; but it's possible there are other issues.
Back-patch as far as 7.4, the oldest version supporting %x.
Tom Lane [Wed, 22 Nov 2006 21:12:57 +0000 (21:12 +0000)]
Fix 1-byte buffer overrun when OID exceeds 1 billion. This probably can't
cause any serious harm in normal cases, but if you have gcc buffer overrun
checking turned on, that will notice. Found by Jack Orenstein. Problem
was already fixed in CVS HEAD.
Tom Lane [Mon, 20 Nov 2006 01:08:02 +0000 (01:08 +0000)]
When truncating a relation in-place (eg during VACUUM), do not try to unlink
any no-longer-needed segments; just truncate them to zero bytes and leave
the files in place for possible future re-use. This avoids problems when
the segments are re-used due to relation growth shortly after truncation.
Before, the bgwriter, and possibly other backends, could still be holding
open file references to the old segment files, and would write dirty blocks
into those files where they'd disappear from the view of other processes.
Back-patch as far as 8.0. I believe the 7.x branches are not vulnerable,
because they had no bgwriter, and "blind" writes by other backends would
always be done via freshly-opened file references.
Tom Lane [Sun, 19 Nov 2006 21:33:29 +0000 (21:33 +0000)]
Repair problems with hash indexes that span multiple segments: the hash code's
preference for filling pages out-of-order tends to confuse the sanity checks
in md.c, as per report from Balazs Nagy in bug #2737. The fix is to ensure
that the smgr-level code always has the same idea of the logical EOF as the
hash index code does, by using ReadBuffer(P_NEW) where we are adding a single
page to the end of the index, and using smgrextend() to reserve a large batch
of pages when creating a new splitpoint. The patch is a bit ugly because it
avoids making any changes in md.c, which seems the most prudent approach for a
backpatchable beta-period fix. After 8.3 development opens, I'll take a look
at a cleaner but more invasive patch, in particular getting rid of the now
unnecessary hack to allow reading beyond EOF in mdread().
Backpatch as far as 7.4. The bug likely exists in 7.3 as well, but because
of the magnitude of the 7.3-to-7.4 changes in hash, the later-version patch
doesn't even begin to apply. Given the other known bugs in the 7.3-era hash
code, it does not seem worth trying to develop a separate patch for 7.3.
Tom Lane [Fri, 17 Nov 2006 18:00:25 +0000 (18:00 +0000)]
Repair two related errors in heap_lock_tuple: it was failing to recognize
cases where we already hold the desired lock "indirectly", either via
membership in a MultiXact or because the lock was originally taken by a
different subtransaction of the current transaction. These cases must be
accounted for to avoid needless deadlocks and/or inappropriate replacement of
an exclusive lock with a shared lock. Per report from Clarence Gardner and
subsequent investigation.
Tom Lane [Mon, 6 Nov 2006 18:21:38 +0000 (18:21 +0000)]
Repair bug #2694 concerning an ARRAY[] construct whose inputs are empty
sub-arrays. Per discussion, if all inputs are empty arrays then result
must be an empty array too, whereas a mix of empty and nonempty arrays
should (and already did) draw an error. In the back branches, the
construct was strict: any NULL input immediately yielded a NULL output;
so I left that behavior alone. HEAD was simply ignoring NULL sub-arrays,
which doesn't seem very sensible. For lack of a better idea it now
treats NULL sub-arrays the same as empty ones.
Tom Lane [Sun, 5 Nov 2006 23:40:38 +0000 (23:40 +0000)]
Fix recently-identified PITR recovery hazard: the base backup could contain
stale relcache init files (pg_internal.init), and there is no mechanism for
updating them during WAL replay. Easiest solution is just to delete the init
files at conclusion of startup, and let the first backend started in each
database take care of rebuilding the init file. Simon Riggs and Tom Lane.
Back-patched to 8.1. Arguably this should be fixed in 8.0 too, but it would
require significantly more code since 8.0 has no handy startup-time scan of
pg_database to piggyback on. Manual solution of the problem is possible
in 8.0 (just delete the pg_internal.init files before starting WAL replay),
so that may be a sufficient answer.
Tom Lane [Sat, 4 Nov 2006 18:20:40 +0000 (18:20 +0000)]
Correct documentation error: in 8.1 and 8.2, %p in archive and restore
command strings inserts relative not absolute path of file to process.
This is a side-effect of 2005-07-04 change that makes the server use
relative paths in general. Noted by Bernd Helmle.
Tom Lane [Wed, 1 Nov 2006 19:50:03 +0000 (19:50 +0000)]
Fix "failed to re-find parent key" btree VACUUM failure by tweaking
_bt_pagedel to recover from the failure: just search the whole parent level
if searching to the right fails. This does nothing for the underlying problem
that index keys became out-of-order in the grandparent level. However, we
believe that there is no other consequence worse than slightly inefficient
searching, so this narrow patch seems like the safest solution for the back
branches.
Tom Lane [Wed, 1 Nov 2006 15:59:31 +0000 (15:59 +0000)]
pg_restore failed on tar-format archives if they contained large objects
(blobs) with comments, per bug #2727 from Konstantin Pelepelin.
Mea culpa for not having tested this case.
Back-patch to 8.1; prior branches don't dump blob comments at all.
Tom Lane [Thu, 19 Oct 2006 17:26:37 +0000 (17:26 +0000)]
Work around reported problem that AIX's getaddrinfo() doesn't seem to zero
sin_port in the returned IP address struct when servname is NULL. This has
been observed to cause failure to bind the stats collection socket, and
could perhaps cause other issues too. Per reports from Brad Nicholson
and Chris Browne.
Teodor Sigaev [Fri, 13 Oct 2006 14:00:17 +0000 (14:00 +0000)]
Fix infinite sleep and failes of send in Win32.
1) pgwin32_waitforsinglesocket(): WaitForMultipleObjectsEx now called with
finite timeout (100ms) in case of FP_WRITE and UDP socket. If timeout occurs
then pgwin32_waitforsinglesocket() tries to write empty packet goes to
WaitForMultipleObjectsEx again.
2) pgwin32_send(): add loop around WSASend and pgwin32_waitforsinglesocket().
The reason is: for overlapped socket, 'ok' result from
pgwin32_waitforsinglesocket() isn't guarantee that socket is still free,
it can become busy again and following WSASend call will fail with
WSAEWOULDBLOCK error.
See http://archives.postgresql.org/pgsql-hackers/2006-10/msg00561.php
Tom Lane [Thu, 12 Oct 2006 17:02:28 +0000 (17:02 +0000)]
Fix mishandling of after-trigger state when a SQL function returns multiple
rows --- if the surrounding query queued any trigger events between the rows,
the events would be fired at the wrong time, leading to bizarre behavior.
Per report from Merlin Moncure.
This is a simple patch that should solve the problem fully in the back
branches, but in HEAD we also need to consider the possibility of queries
with RETURNING clauses. Will look into a fix for that separately.
Tom Lane [Wed, 11 Oct 2006 20:21:11 +0000 (20:21 +0000)]
Repair incorrect check for coercion of unknown literal to ANYARRAY, a bug
I introduced in 7.4.1 :-(. It's correct to allow unknown to be coerced to
ANY or ANYELEMENT, since it's a real-enough data type, but it most certainly
isn't an array datatype. This can cause a backend crash but AFAICT is not
exploitable as a security hole. Per report from Michael Fuhr.
Note: as fixed in HEAD, this changes a constant in the pg_stats view,
resulting in a change in the expected regression outputs. The back-branch
patches have been hacked to avoid that, so that pre-existing installations
won't start failing their regression tests.
Tom Lane [Wed, 11 Oct 2006 20:03:11 +0000 (20:03 +0000)]
CREATE TABLE ... LIKE ... should mark the columns it creates with
attislocal = true, since they are not really inherited but merely copied
from the original table. I'm not sure if there are any cases where it makes
a real difference given the existing uses of the flag, but wrong is wrong.
This was fixed in passing in HEAD by the LIKE INCLUDING CONSTRAINTS patch,
but never back-patched.
Bruce Momjian [Tue, 10 Oct 2006 20:11:44 +0000 (20:11 +0000)]
Restore HPUX FAQ entry that talked about working around regression
script problems, because in 8.1.X, the regression test is still a
script. Patch to 8.1.X only.
Tom Lane [Tue, 10 Oct 2006 16:15:22 +0000 (16:15 +0000)]
Fix psql \d commands to behave properly when a pattern using regex | is given.
Formerly they'd emit '^foo|bar$' which is wrong because the anchors are
parsed as part of the alternatives; must emit '^(foo|bar)$' to get expected
behavior. Same as bug found previously in similar_escape(). Already fixed
in HEAD, this is just back-porting the part of that patch that was a bug fix.
Tom Lane [Mon, 9 Oct 2006 01:45:41 +0000 (01:45 +0000)]
Fix back-branch pg_regress scripts to try the "canonical" expected file if we
tried a variant file from resultmap and it didn't match. This is already done
in HEAD's C-code version, and is needed because OpenBSD has recently migrated
to a more standard handling of float underflow --- see buildfarm results
from emu.
Tom Lane [Sat, 7 Oct 2006 22:21:44 +0000 (22:21 +0000)]
Fix ancient oversight in psql's \d pattern processing code: when seeing two
quote chars inside quote marks, should emit one quote *and stay in inquotes
mode*. No doubt the lack of reports of this have something to do with the
poor documentation of the feature ...
Tom Lane [Sat, 7 Oct 2006 00:11:59 +0000 (00:11 +0000)]
Fix string_to_array() to correctly handle the case where there are
overlapping possible matches for the separator string, such as
string_to_array('123xx456xxx789', 'xx').
Also, revise the logic of replace(), split_part(), and string_to_array()
to avoid O(N^2) work from redundant searches and conversions to pg_wchar
format when there are N matches to the separator string.
Backpatched the full patch as far as 8.0. 7.4 also has the bug, but the
code has diverged a lot, so I just went for a quick-and-dirty fix of the
bug itself in that branch.
Tom Lane [Fri, 6 Oct 2006 18:23:41 +0000 (18:23 +0000)]
Fix SysCacheGetAttr() to handle the case where the specified syscache has not
been initialized yet. This can happen because there are code paths that call
SysCacheGetAttr() on a tuple originally fetched from a different syscache
(hopefully on the same catalog) than the one specified in the call. It
doesn't seem useful or robust to try to prevent that from happening, so just
improve the function to cope instead. Per bug#2678 from Jeff Trout. The
specific example shown by Jeff is new in 8.1, but to be on the safe side
I'm backpatching 8.0 as well. We could patch 7.x similarly but I think
that's probably overkill, given the lack of evidence of old bugs of this ilk.
Tom Lane [Sun, 1 Oct 2006 17:23:51 +0000 (17:23 +0000)]
Fix overly enthusiastic Assert introduced in 8.1: it's expecting a
CaseTestExpr, but forgot that the optimizer is sometimes able to replace
CaseTestExpr by Const.
Tom Lane [Thu, 31 Aug 2006 17:31:40 +0000 (17:31 +0000)]
Clean up rather sloppy fix in HEAD for the ancient bug that CREATE CONVERSION
didn't create a dependency from the new conversion to its schema. Back-patch
to all supported releases.
Tom Lane [Mon, 14 Aug 2006 00:46:59 +0000 (00:46 +0000)]
Get rid of "lookahead" functionality in plpgsql's yylex() function,
and instead make the grammar production for the RETURN statement do the
heavy lifting. The lookahead idea was copied from the main parser, but
it does not work in plpgsql's parser because here gram.y looks explicitly
at the scanner's yytext variable, which will be out of sync after a
failed lookahead step. A minimal example is
create or replace function foo() returns void language plpgsql as '
begin
perform return foo bar;
end';
which can be seen by testing to deliver "foo foo bar" to the main parser
instead of the expected "return foo bar". This isn't a huge bug since
RETURN is not found in the main grammar, but it could bite someone who
tried to use "return" as an identifier.
Back-patch to 8.1. Bug exists further back, but HEAD patch doesn't apply
cleanly, and given the lack of field complaints it doesn't seem worth
the effort to develop adjusted patches.
Tom Lane [Sun, 13 Aug 2006 22:18:22 +0000 (22:18 +0000)]
Fix core dump in duration logging for a V3-protocol Execute message
when what's being executed is a COMMIT or ROLLBACK. Per report from
Sergey Koposov. Backpatch to 8.1; 8.0 and before don't have the bug
due to lack of any logging at all here.
Bruce Momjian [Wed, 9 Aug 2006 17:47:06 +0000 (17:47 +0000)]
Fix statement_timeout on Win32 so that it properly treats micro-seconds
as micro-seconds, rather than as 100 microseconds, as it does now. This
actually fixes all setitimer calls on Win32, but statement_timeout is
the most visible fix.
Tom Lane [Wed, 2 Aug 2006 16:30:00 +0000 (16:30 +0000)]
Fix documentation error: GRANT/REVOKE for roles only accept role names
as grantees, not PUBLIC ... and you can't say GROUP either. Noted by
Brian Hurt.
Andrew Dunstan [Sat, 29 Jul 2006 20:00:00 +0000 (20:00 +0000)]
prevent multiplexing Windows kernel event objects we listen for across various sockets - should fix the occasional stats test regression failures we see.
Tom Lane [Sun, 23 Jul 2006 18:34:50 +0000 (18:34 +0000)]
Fix oversight in sizing of shared buffer lookup hashtable. Because
BufferAlloc tries to insert a new mapping entry before deleting the old one
for a buffer, we have a transient need for more than NBuffers entries ---
one more in 8.1, and as many as NUM_BUFFER_PARTITIONS more in CVS HEAD.
In theory this could lead to an "out of shared memory" failure if shmem
had already been completely claimed by the time the extra entries were
needed.
Tom Lane [Sat, 22 Jul 2006 21:04:46 +0000 (21:04 +0000)]
Hmm, seems --disable-spinlocks has been broken for awhile and nobody
noticed. Fix SpinlockSemas() to report the correct count considering
that PG 8.1 adds a spinlock to each shared-buffer header.
Tom Lane [Thu, 20 Jul 2006 00:46:56 +0000 (00:46 +0000)]
Don't try to truncate multixact SLRU files in checkpoints done during xlog
recovery. In the first place, it doesn't work because slru's
latest_page_number isn't set up yet (this is why we've been hearing reports
of strange "apparent wraparound" log messages during crash recovery, but
only from people who'd managed to advance their next-mxact counters some
considerable distance from 0). In the second place, it seems a bit unwise
to be throwing away data during crash recovery anwyway. This latter
consideration convinces me to just disable truncation during recovery,
rather than computing latest_page_number and pushing ahead.
Tom Lane [Sun, 16 Jul 2006 18:17:23 +0000 (18:17 +0000)]
Ensure that we retry rather than erroring out when send() or recv() return
EINTR; the stats code was failing to do this and so were a couple of places
in the postmaster. The stats code assumed that recv() could not return EINTR
if a preceding select() showed the socket to be read-ready, but this is
demonstrably false with our Windows implementation of recv(), and it may
not be the case on all Unix variants either. I think this explains the
intermittent stats regression test failures we've been seeing, as well
as reports of stats collector instability under high load on Windows.