Bruce Momjian [Fri, 9 Feb 2007 03:37:45 +0000 (03:37 +0000)]
Done!
< * Merge xmin/xmax/cmin/cmax back into three header fields
<
< Before subtransactions, there used to be only three fields needed to
< store these four values. This was possible because only the current
< transaction looks at the cmin/cmax values. If the current transaction
< created and expired the row the fields stored where xmin (same as
< xmax), cmin, cmax, and if the transaction was expiring a row from a
< another transaction, the fields stored were xmin (cmin was not
< needed), xmax, and cmax. Such a system worked because a transaction
< could only see rows from another completed transaction. However,
< subtransactions can see rows from outer transactions, and once the
< subtransaction completes, the outer transaction continues, requiring
< the storage of all four fields. With subtransactions, an outer
< transaction can create a row, a subtransaction expire it, and when the
< subtransaction completes, the outer transaction still has to have
< proper visibility of the row's cmin, for example, for cursors.
<
< One possible solution is to create a phantom cid which represents a
< cmin/cmax pair and is stored in local memory. Another idea is to
< store both cmin and cmax only in local memory.
<
> * -Merge xmin/xmax/cmin/cmax back into three header fields
Tom Lane [Fri, 9 Feb 2007 03:35:35 +0000 (03:35 +0000)]
Combine cmin and cmax fields of HeapTupleHeaders into a single field, by
keeping private state in each backend that has inserted and deleted the same
tuple during its current top-level transaction. This is sufficient since
there is no need to be able to determine the cmin/cmax from any other
transaction. This gets us back down to 23-byte headers, removing a penalty
paid in 8.0 to support subtransactions. Patch by Heikki Linnakangas, with
minor revisions by moi, following a design hashed out awhile back on the
pghackers list.
Bruce Momjian [Fri, 9 Feb 2007 00:34:31 +0000 (00:34 +0000)]
Update:
< * Consider placing all sequences in a single table, now that system
< tables are full transactional
> * Consider placing all sequences in a single table
Tom Lane [Thu, 8 Feb 2007 18:37:30 +0000 (18:37 +0000)]
Fix an ancient logic error in plpgsql's exec_stmt_block: it thought it could
get away with not (re)initializing a local variable if the variable is marked
"isconst" and not "isnull". Unfortunately it makes this decision after having
already freed the old value, meaning that something like
for i in 1..10 loop
declare c constant text := 'hi there';
leads to subsequent accesses to freed memory, and hence probably crashes.
(In particular, this is why Asif Ali Rehman's bug leads to crash and not
just an unexpectedly-NULL value for SQLERRM: SQLERRM is marked CONSTANT
and so triggers this error.)
The whole thing seems wrong on its face anyway: CONSTANT means that you can't
change the variable inside the block, not that the initializer expression is
guaranteed not to change value across successive block entries. Hence,
remove the "optimization" instead of trying to fix it.
Tom Lane [Thu, 8 Feb 2007 18:37:14 +0000 (18:37 +0000)]
Rearrange use of plpgsql_add_initdatums() so that only the parsing of a
DECLARE section needs to know about it. Formerly, everyplace besides DECLARE
that created variables needed to do "plpgsql_add_initdatums(NULL)" to prevent
those variables from being sucked up as part of a subsequent DECLARE block.
This is obviously error-prone, and in fact the SQLSTATE/SQLERRM patch had
failed to do it for those two variables, leading to the bug recently exhibited
by Asif Ali Rehman: a DECLARE within an exception handler tried to reinitialize
SQLERRM.
Although the SQLSTATE/SQLERRM patch isn't in any pre-8.1 branches, and so
I can't point to a demonstrable failure there, it seems wise to back-patch
this into the older branches anyway, just to keep the logic similar to HEAD.
Bruce Momjian [Thu, 8 Feb 2007 15:28:58 +0000 (15:28 +0000)]
Win32 regression test fixes:
For win32 in general, this makes it possible to run the regression tests
as an admin user by using the same restricted token method that's used
by pg_ctl and initdb.
For vc++, it adds building of pg_regress.exe, adds a resultmap, and
fixes how it runs the install.
Bruce Momjian [Thu, 8 Feb 2007 15:16:19 +0000 (15:16 +0000)]
Add /contrib/pg_standby:
pg_standby is a production-ready program that can be used to
create a Warm Standby server. Other configuration is required
as well, all of which is described in the main server manual.
Bruce Momjian [Thu, 8 Feb 2007 15:09:47 +0000 (15:09 +0000)]
Add /contrib/pg_standby:
pg_standby is a production-ready program that can be used to
create a Warm Standby server. Other configuration is required
as well, all of which is described in the main server manual.
Normalize fgets() calls to use sizeof() for calculating the buffer size
where possible, and fix some sites that apparently thought that fgets()
will overwrite the buffer by one byte.
Also add some strlcpy() to eliminate some weird memory handling.
Bruce Momjian [Thu, 8 Feb 2007 05:05:53 +0000 (05:05 +0000)]
Reduce WAL activity for page splits:
> Currently, an index split writes all the data on the split page to
> WAL. That's a lot of WAL traffic. The tuples that are copied to the
> right page need to be WAL logged, but the tuples that stay on the
> original page don't.
Tom Lane [Wed, 7 Feb 2007 23:11:30 +0000 (23:11 +0000)]
Add a function pg_stat_clear_snapshot() that discards any statistics snapshot
already collected in the current transaction; this allows plpgsql functions to
watch for stats updates even though they are confined to a single transaction.
Use this instead of the previous kluge involving pg_stat_file() to wait for
the stats collector to update in the stats regression test. Internally,
decouple storage of stats snapshots from transaction boundaries; they'll
now stick around until someone calls pgstat_clear_snapshot --- which xact.c
still does at transaction end, to maintain the previous behavior. This makes
the logic a lot cleaner, at the price of a couple dozen cycles per transaction
exit.
Tom Lane [Wed, 7 Feb 2007 18:34:56 +0000 (18:34 +0000)]
Modify the stats regression test to delay until the stats file actually
changes (with an upper limit of 30 seconds), and record the delay time in
the postmaster log. This should give us some info about what's happening
with the intermittent stats failures in buildfarm. After an idea of
Andrew Dunstan's.
Tom Lane [Wed, 7 Feb 2007 16:44:48 +0000 (16:44 +0000)]
Remove the xlog-centric "database system is ready" message and replace it with
"database system is ready to accept connections", which is issued by the
postmaster when it really is ready to accept connections. Per proposal from
Markus Schiltknecht and subsequent discussion.
Tom Lane [Tue, 6 Feb 2007 22:49:24 +0000 (22:49 +0000)]
Fix an error in the original coding of holdable cursors: PersistHoldablePortal
thought that it didn't have to reposition the underlying tuplestore if the
portal is atEnd. But this is not so, because tuplestores have separate read
and write cursors ... and the read cursor hasn't moved from the start.
This mistake explains bug #2970 from William Zhang.
Note: the coding here is pretty inefficient, but given that no one has noticed
this bug until now, I'd say hardly anyone uses the case where the cursor has
been advanced before being persisted. So maybe it's not worth worrying about.
Bruce Momjian [Tue, 6 Feb 2007 18:31:26 +0000 (18:31 +0000)]
Update timezone FAQ item:
<P>USA saving time changes are included in PostgreSQL release 8.0.[4+],
and all later major releases, e.g. 8.1. Canada and Western Australia
changes are included in 8.0.[10+], 8.1.[6+], and all later major
releases. PostgreSQL releases prior to 8.0 use the operating system's
timezone database for daylight saving information.</P>
Tom Lane [Tue, 6 Feb 2007 17:35:20 +0000 (17:35 +0000)]
Remove typmod checking from the recent security-related patches. It turns
out that ExecEvalVar and friends don't necessarily have access to a tuple
descriptor with correct typmod: it definitely can contain -1, and possibly
might contain other values that are different from the Var's value.
Arguably this should be cleaned up someday, but it's not a simple change,
and in any case typmod discrepancies don't pose a security hazard.
Per reports from numerous people :-(
I'm not entirely sure whether the failure can occur in 8.0 --- the simple
test cases reported so far don't trigger it there. But back-patch the
change all the way anyway.
Move NAMEDATALEN definition from postgres_ext.h to pg_config_manual.h. It
used to be part of libpq's exported interface many releases ago, but now
it's no longer necessary to make it accessible to clients.
Tom Lane [Tue, 6 Feb 2007 06:50:26 +0000 (06:50 +0000)]
Fix a performance regression in 8.2: optimization of MIN/MAX into indexscans
had stopped working for tables buried inside views or sub-selects. This is
because I had gotten rid of the simplify_jointree() preprocessing step, and
optimize_minmax_aggregates() wasn't smart enough to deal with a non-canonical
FromExpr. Per gripe from Bill Howe.
Tom Lane [Tue, 6 Feb 2007 02:59:15 +0000 (02:59 +0000)]
Add support for cross-type hashing in hashed subplans (hashed IN/NOT IN cases
that aren't turned into true joins). Since this is the last missing bit of
infrastructure, go ahead and fill out the hash integer_ops and float_ops
opfamilies with cross-type operators. The operator family project is now
DONE ... er, except for documentation ...
Bruce Momjian [Mon, 5 Feb 2007 17:17:13 +0000 (17:17 +0000)]
Updated TODO item:
> o Add a \set variable to control whether \s displays line numbers
> Another option is to add \# which lists line numbers, and
> allows command execution.
> http://archives.postgresql.org/pgsql-hackers/2006-12/msg00255.php
Tom Lane [Mon, 5 Feb 2007 04:22:18 +0000 (04:22 +0000)]
Rename MaxTupleSize to MaxHeapTupleSize to clarify that it's not meant to
describe the maximum size of index tuples (which is typically AM-dependent
anyway); and consequently remove the bogus deduction for "special space"
that was built into it.
Adjust TOAST_TUPLE_THRESHOLD and TOAST_MAX_CHUNK_SIZE to avoid wasting two
bytes per toast chunk, and to ensure that the calculation correctly tracks any
future changes in page header size. The computation had been inaccurate in a
way that didn't cause any harm except space wastage, but future changes could
have broken it more drastically.
Fix the calculation of BTMaxItemSize, which was formerly computed as 1 byte
more than it could safely be. This didn't cause any harm in practice because
it's only compared against maxalign'd lengths, but future changes in the size
of page headers or btree special space could have exposed the problem.
initdb forced because of change in TOAST_MAX_CHUNK_SIZE, which alters the
storage of toast tables.
Tom Lane [Sun, 4 Feb 2007 20:00:37 +0000 (20:00 +0000)]
Don't MAXALIGN in the checks to decide whether a tuple is over TOAST's
threshold for tuple length. On 4-byte-MAXALIGN machines, the toast code
creates tuples that have t_len exactly TOAST_TUPLE_THRESHOLD ... but this
number is not itself maxaligned, so if heap_insert maxaligns t_len before
comparing to TOAST_TUPLE_THRESHOLD, it'll uselessly recurse back to
tuptoaster.c, wasting cycles. (It turns out that this does not happen on
8-byte-MAXALIGN machines, because for them the outer MAXALIGN in the
TOAST_MAX_CHUNK_SIZE macro reduces TOAST_MAX_CHUNK_SIZE so that toast tuples
will be less than TOAST_TUPLE_THRESHOLD in size. That MAXALIGN is really
incorrect, but we can't remove it now, see below.) There isn't any particular
value in maxaligning before comparing to the thresholds, so just don't do
that, which saves a small number of cycles in itself.
These numbers should be rejiggered to minimize wasted space on toast-relation
pages, but we can't do that in the back branches because changing
TOAST_MAX_CHUNK_SIZE would force an initdb (by changing the contents of toast
tables). We can move the toast decision thresholds a bit, though, which is
what this patch effectively does.
Thanks to Pavan Deolasee for discovering the unintended recursion.
Back-patch into 8.2, but not further, pending more testing. (HEAD is about
to get a further patch modifying the thresholds, so it won't help much
for testing this form of the patch.)
Bruce Momjian [Sat, 3 Feb 2007 23:52:19 +0000 (23:52 +0000)]
Add URLs for:
* Allow sequential scans to take advantage of other concurrent
sequential scans, also called "Synchronised Scanning"
> http://archives.postgresql.org/pgsql-patches/2006-12/msg00076.php
> http://archives.postgresql.org/pgsql-hackers/2006-12/msg00408.php
Bruce Momjian [Sat, 3 Feb 2007 22:32:49 +0000 (22:32 +0000)]
Add:
> o Allow recovery.conf to allow the same syntax as
> postgresql.conf, including quoting
>
> http://archives.postgresql.org/pgsql-hackers/2006-12/msg00497.php
Bruce Momjian [Fri, 2 Feb 2007 23:05:36 +0000 (23:05 +0000)]
Add URL for:
* Allow sequential scans to take advantage of other concurrent
sequential scans, also called "Synchronised Scanning"
>
> http://archives.postgresql.org/pgsql-hackers/2006-12/msg00784.php
Bruce Momjian [Fri, 2 Feb 2007 22:55:08 +0000 (22:55 +0000)]
Add:
> * Reduce checkpoint performance degredation by forcing data to disk
> more evenly
>
> http://archives.postgresql.org/pgsql-hackers/2006-12/msg00337.php
> http://archives.postgresql.org/pgsql-hackers/2007-01/msg00079.php
Neil Conway [Fri, 2 Feb 2007 16:25:34 +0000 (16:25 +0000)]
This patch changes the installscript for vcbuild to actually parse the
generated solution files for what to install, instead of blindly copying
everything as it previously did. With the previous quick-n-dirty
version, it would copy old DLLs if you reconfigured in a way that didn't
include subprojects like a PL for example.
Bruce Momjian [Fri, 2 Feb 2007 05:42:56 +0000 (05:42 +0000)]
Add:
> o Allow column display reordering by recording a display,
> storage, and permanent id for every column?
>
> http://archives.postgresql.org/pgsql-hackers/2006-12/msg00782.php
>
Tom Lane [Fri, 2 Feb 2007 00:07:03 +0000 (00:07 +0000)]
Repair failure to check that a table is still compatible with a previously
made query plan. Use of ALTER COLUMN TYPE creates a hazard for cached
query plans: they could contain Vars that claim a column has a different
type than it now has. Fix this by checking during plan startup that Vars
at relation scan level match the current relation tuple descriptor. Since
at that point we already have at least AccessShareLock, we can be sure the
column type will not change underneath us later in the query. However,
since a backend's locks do not conflict against itself, there is still a
hole for an attacker to exploit: he could try to execute ALTER COLUMN TYPE
while a query is in progress in the current backend. Seal that hole by
rejecting ALTER TABLE whenever the target relation is already open in
the current backend.
This is a significant security hole: not only can one trivially crash the
backend, but with appropriate misuse of pass-by-reference datatypes it is
possible to read out arbitrary locations in the server process's memory,
which could allow retrieving database content the user should not be able
to see. Our thanks to Jeff Trout for the initial report.
Tom Lane [Fri, 2 Feb 2007 00:02:55 +0000 (00:02 +0000)]
Repair insufficiently careful type checking for SQL-language functions:
we should check that the function code returns the claimed result datatype
every time we parse the function for execution. Formerly, for simple
scalar result types we assumed the creation-time check was sufficient, but
this fails if the function selects from a table that's been redefined since
then, and even more obviously fails if check_function_bodies had been OFF.
This is a significant security hole: not only can one trivially crash the
backend, but with appropriate misuse of pass-by-reference datatypes it is
possible to read out arbitrary locations in the server process's memory,
which could allow retrieving database content the user should not be able
to see. Our thanks to Jeff Trout for the initial report.
Neil Conway [Thu, 1 Feb 2007 20:11:18 +0000 (20:11 +0000)]
Update some of the "expected" regression test results for Bruce's
recent may/might cleanup, in the hopes that this will unbreak the
buildfarm. Per report from Stefan Kaltenbrunner.
Tom Lane [Thu, 1 Feb 2007 19:22:07 +0000 (19:22 +0000)]
Fix plpgsql so that when a local variable has no initial-value expression,
an error will be thrown correctly if the variable is of a NOT NULL domain.
Report and almost-correct fix from Sergiy Vyshnevetskiy (bug #2948).
Bruce Momjian [Thu, 1 Feb 2007 19:10:30 +0000 (19:10 +0000)]
Wording cleanup for error messages. Also change can't -> cannot.
Standard English uses "may", "can", and "might" in different ways:
may - permission, "You may borrow my rake."
can - ability, "I can lift that log."
might - possibility, "It might rain today."
Unfortunately, in conversational English, their use is often mixed, as
in, "You may use this variable to do X", when in fact, "can" is a better
choice. Similarly, "It may crash" is better stated, "It might crash".
Neil Conway [Thu, 1 Feb 2007 04:39:33 +0000 (04:39 +0000)]
This patch adds documentation for the long-version parameters --username
and --password for pg_dump, pg_dumpall and pg_restore, per complaint by
Michael Schmidt. Patch from Magnus Hagander.
Bruce Momjian [Thu, 1 Feb 2007 04:35:52 +0000 (04:35 +0000)]
Add:
>
> * Fix problem when multiple subtransactions of the same outer transaction
> hold different types of locks, and one subtransaction aborts
>
> http://archives.postgresql.org/pgsql-hackers/2006-11/msg01011.php
> http://archives.postgresql.org/pgsql-hackers/2006-12/msg00001.php
Bruce Momjian [Thu, 1 Feb 2007 00:34:03 +0000 (00:34 +0000)]
Update CREATE SEQUENCE documentation to show the same sequence being
created and increments. The old docs created the sequence, then showed
a nextval() of 114.