Improve vacuum code to track minimum Xids per table instead of per database.
To this end, add a couple of columns to pg_class, relminxid and relvacuumxid,
based on which we calculate the pg_database columns after each vacuum.
We now force all databases to be vacuumed, even template ones. A backend
noticing too old a database (meaning pg_database.datminxid is in danger of
falling behind Xid wraparound) will signal the postmaster, which in turn will
start an autovacuum iteration to process the offending database. In principle
this is only there to cope with frozen (non-connectable) databases without
forcing users to set them to connectable, but it could force regular user
database to go through a database-wide vacuum at any time. Maybe we should
warn users about this somehow. Of course the real solution will be to use
autovacuum all the time ;-)
There are some additional improvements we could have in this area: for example
the vacuum code could be smarter about not updating pg_database for each table
when called by autovacuum, and do it only once the whole autovacuum iteration
is done.
I updated the system catalogs documentation, but I didn't modify the
maintenance section. Also having some regression tests for this would be nice
but it's not really a very straightforward thing to do.
Catalog version bumped due to system catalog changes.
Bruce Momjian [Thu, 6 Jul 2006 02:18:07 +0000 (02:18 +0000)]
Add index information to /contrib/pgstattuple:
This is an extension of pgstattuple to query information from indexes.
It supports btree, hash and gist. Gin is not supported. It scans only
index pages and does not read corresponding heap tuples. Therefore,
'dead_tuple' means the number of tuples with LP_DELETE flag.
Also, I added an experimental feature for btree indexes. It checks
fragmentation factor of indexes. If an leaf has the right link on the
next adjacent page in the file, it is assumed to be continuous (not
fragmented). It will help us to decide when to REINDEX.
Bruce Momjian [Thu, 6 Jul 2006 01:57:25 +0000 (01:57 +0000)]
Backpatch dbmirror fix for escape handling.
> Upstream confirmed my reply in the last mail in [1]: the complete
> escaping logic in DBMirror.pl is seriously screwew.
>
> [1] http://archives.postgresql.org/pgsql-bugs/2006-06/msg00065.php
I finally found some time to debug this, and I think I found a better
patch than the one you proposed. Mine is still hackish and is still a
workaround around a proper quoting solution, but at least it repairs
the parsing without introducing the \' quoting again.
I consider this a band-aid patch to fix the recent security update.
PostgreSQL gurus, would you consider applying this until a better
solution is found for DBMirror.pl?
Olivier, can you please confirm that the patch works for you, too?
Bruce Momjian [Thu, 6 Jul 2006 01:55:51 +0000 (01:55 +0000)]
Attached patch is required ot build with the CRT that comes with Visual
Studio 2005. Basically MS defined errcode in the headers with a typedef,
so we have to #define it out of the way.
While at it, fix a function declaration in plpython that didn't match
the implementation (volatile missing).
Tom Lane [Mon, 3 Jul 2006 22:45:41 +0000 (22:45 +0000)]
Code review for FILLFACTOR patch. Change WITH grammar as per earlier
discussion (including making def_arg allow reserved words), add missed
opt_definition for UNIQUE case. Put the reloptions support code in a less
random place (I chose to make a new file access/common/reloptions.c).
Eliminate header inclusion creep. Make the index options functions safely
user-callable (seems like client apps might like to be able to test validity
of options before trying to make an index). Reduce overhead for normal case
with no options by allowing rd_options to be NULL. Fix some unmaintainably
klugy code, including getting rid of Natts_pg_class_fixed at long last.
Some stylistic cleanup too, and pay attention to keeping comments in sync
with code.
Documentation still needs work, though I did fix the omissions in
catalogs.sgml and indexam.sgml.
Neil Conway [Sun, 2 Jul 2006 05:17:26 +0000 (05:17 +0000)]
Do a pass of code review for the ALTER TABLE ADD INHERITS patch. Keep
the read lock we hold on the table's parent relation until commit.
Update equalfuncs.c for the new field in AlterTableCmd. Various
improvements to comments, variable names, and error reporting.
There is room for further improvement here, but this is at least
a step in the right direction.
Bruce Momjian [Sun, 2 Jul 2006 01:59:46 +0000 (01:59 +0000)]
Done:
< o Add ALTER TABLE tab ADD/DROP INHERITS parent
<
< This allows tables to be added/removed from an inheritance
< hierarchy. This is particularly useful for table partitioning.
< http://archives.postgresql.org/pgsql-hackers/2006-05/msg00988.php
<
> o -Add ALTER TABLE tab INHERIT / NO INHERIT parent
Bruce Momjian [Sun, 2 Jul 2006 01:58:36 +0000 (01:58 +0000)]
ALTER TABLE ... ADD/DROPS INHERIT (actually INHERIT / NO INHERIT)
Open items:
There were a few tangentially related issues that have come up that I think
are TODOs. I'm likely to tackle one or two of these next so I'm interested in
hearing feedback on them as well.
. Constraints currently do not know anything about inheritance. Tom suggested
adding a coninhcount and conislocal like attributes have to track their
inheritance status.
. Foreign key constraints currently do not get copied to new children (and
therefore my code doesn't verify them). I don't think it would be hard to
add them and treat them like CHECK constraints.
. No constraints at all are copied to tables defined with LIKE. That makes it
hard to use LIKE to define new partitions. The standard defines LIKE and
specifically says it does not copy constraints. But the standard already has
an option called INCLUDING DEFAULTS; we could always define a non-standard
extension LIKE table INCLUDING CONSTRAINTS that gives the user the option to
request a copy including constraints.
. Personally, I think the whole attislocal thing is bunk. The decision about
whether to drop a column from children tables or not is something that
should be up to the user and trying to DWIM based on whether there was ever
a local definition or the column was acquired purely through inheritance is
hardly ever going to match up with user expectations.
. And of course there's the whole unique and primary key constraint issue. I
think to get any traction at all on this you have a prerequisite of a real
partitioned table implementation where the system knows what the partition
key is so it can recognize when it's a leading part of an index key.
Tom Lane [Sat, 1 Jul 2006 22:07:23 +0000 (22:07 +0000)]
Fix oversight in planning for multiple indexscans driven by
ScalarArrayOpExpr index quals: we were estimating the right total
number of rows returned, but treating the index-access part of the
cost as if a single scan were fetching that many consecutive index
tuples. Actually we should treat it as a multiple indexscan, and
if there are enough of 'em the Mackert-Lohman discount should kick in.
Tom Lane [Sat, 1 Jul 2006 18:38:33 +0000 (18:38 +0000)]
Revise the planner's handling of "pseudoconstant" WHERE clauses, that is
clauses containing no variables and no volatile functions. Such a clause
can be used as a one-time qual in a gating Result plan node, to suppress
plan execution entirely when it is false. Even when the clause is true,
putting it in a gating node wins by avoiding repeated evaluation of the
clause. In previous PG releases, query_planner() would do this for
pseudoconstant clauses appearing at the top level of the jointree, but
there was no ability to generate a gating Result deeper in the plan tree.
To fix it, get rid of the special case in query_planner(), and instead
process pseudoconstant clauses through the normal RestrictInfo qual
distribution mechanism. When a pseudoconstant clause is found attached to
a path node in create_plan(), pull it out and generate a gating Result at
that point. This requires special-casing pseudoconstants in selectivity
estimation and cost_qual_eval, but on the whole it's pretty clean.
It probably even makes the planner a bit faster than before for the normal
case of no pseudoconstants, since removing pull_constant_clauses saves one
useless traversal of the qual tree. Per gripe from Phil Frost.
Tom Lane [Thu, 29 Jun 2006 20:00:08 +0000 (20:00 +0000)]
Remove the separate 'stats buffer' process, letting backend stats messages
be delivered directly to the collector process. The extra process context
swaps required to transfer data through the buffer process seem to outweigh
any value the buffering might have. Per recent discussion and tests.
I modified Bruce's draft patch to use poll() rather than select() where
available (this makes a noticeable difference on my system), and fixed
up the EXEC_BACKEND case.
Tom Lane [Thu, 29 Jun 2006 16:07:29 +0000 (16:07 +0000)]
Change TRUNCATE's method for searching for foreign-key references so that
the order in which it visits tables is not dependent on the physical order
of pg_constraint entries, and neither are the error messages it gives.
This should correct recently-noticed instability in regression tests.
Neil Conway [Wed, 28 Jun 2006 22:11:01 +0000 (22:11 +0000)]
Add missing #include directive: pg_constraint.h declares some functions
whose prototypes include the "List" type, so it ought to include the
list header file.
Neil Conway [Wed, 28 Jun 2006 22:01:52 +0000 (22:01 +0000)]
Editorialization for the additions to the CREATE TABLE reference page
made as part of the recent INCLUDING CONSTRAINTS patch. The text could
stand further improvement, but this is at least a step in the right
direction.
Tom Lane [Wed, 28 Jun 2006 19:40:52 +0000 (19:40 +0000)]
Fix hash aggregation to suppress unneeded columns from being stored in
tuple hash table entries. This addresses the problem previously noted
that use of a 'physical tlist' in the input scan node could bloat the
hash table entries far beyond what the planner expects. It's a better
answer than my previous thought of undoing the physical tlist optimization,
because we can also remove columns that are needed to compute the aggregate
functions but aren't part of the grouping column set.
Bruce Momjian [Wed, 28 Jun 2006 15:39:32 +0000 (15:39 +0000)]
Update:
< o Add support for WITH HOLD cursors
> o Add support for WITH HOLD and SCROLL cursors
>
> PL/pgSQL cursors should support the same syntax as
> backend cursors.
>
Teodor Sigaev [Wed, 28 Jun 2006 12:00:14 +0000 (12:00 +0000)]
Changes
* new split algorithm (as proposed in http://archives.postgresql.org/pgsql-hackers/2006-06/msg00254.php)
* possible call pickSplit() for second and below columns
* add spl_(l|r)datum_exists to GIST_SPLITVEC -
pickSplit should check its values to use already defined
spl_(l|r)datum for splitting. pickSplit should set
spl_(l|r)datum_exists to 'false' (if they was 'true') to
signal to caller about using spl_(l|r)datum.
* support for old pickSplit(): not very optimal
but correct split
* remove 'bytes' field from GISTENTRY: in any case size of
value is defined by it's type.
* split GIST_SPLITVEC to two structures: one for using in picksplit
and second - for internal use.
* some code refactoring
* support of subsplit to rtree opclasses
Tom Lane [Tue, 27 Jun 2006 18:59:17 +0000 (18:59 +0000)]
Put #ifdef NOT_USED around posix_fadvise call. We may want to resurrect
this someday, but right now it seems that posix_fadvise is immature to
the point of being broken on many platforms ... and we don't have any
benchmark evidence proving it's worth spending time on.
Tom Lane [Tue, 27 Jun 2006 16:53:02 +0000 (16:53 +0000)]
Extend the MinimalTuple concept to tuplesort.c, thereby reducing the
per-tuple space overhead for sorts in memory. I chose to replace the
previous patch that tried to write out the bare minimum amount of data
when sorting on disk; instead, just dump the MinimalTuples as-is. This
wastes 3 to 10 bytes per tuple depending on architecture and null-bitmap
length, but the simplification in the writetup/readtup routines seems
worth it.
Alvaro Herrera [Tue, 27 Jun 2006 03:45:16 +0000 (03:45 +0000)]
Clamp last_anl_tuples to n_live_tuples, in case we vacuum a table without
analyzing, so that future analyze threshold calculations don't get confused.
Also, make sure we correctly track the decrease of live tuples cause by
deletes.
Per report from Dylan Hansen, patches by Tom Lane and me.
Bruce Momjian [Tue, 27 Jun 2006 03:22:45 +0000 (03:22 +0000)]
Done:
< * %Disallow changing DEFAULT expression of a SERIAL column?
<
< This should be done only if the existing SERIAL problems cannot be
< fixed.
<
> * -Disallow changing DEFAULT expression of a SERIAL column
Tom Lane [Tue, 27 Jun 2006 02:51:40 +0000 (02:51 +0000)]
Create infrastructure for 'MinimalTuple' representation of in-memory
tuples with less header overhead than a regular HeapTuple, per my
recent proposal. Teach TupleTableSlot code how to deal with these.
As proof of concept, change tuplestore.c to store MinimalTuples instead
of HeapTuples. Future patches will expand the concept to other places
where it is useful.
Tom Lane [Mon, 26 Jun 2006 17:24:41 +0000 (17:24 +0000)]
Change the row constructor syntax (ROW(...)) so that list elements foo.*
will be expanded to a list of their member fields, rather than creating
a nested rowtype field as formerly. (The old behavior is still available
by omitting '.*'.) This syntax is not allowed by the SQL spec AFAICS,
so changing its behavior doesn't violate the spec. The new behavior is
substantially more useful since it allows, for example, triggers to check
for data changes with 'if row(new.*) is distinct from row(old.*)'. Per
my recent proposal.
Tom Lane [Sun, 25 Jun 2006 18:29:49 +0000 (18:29 +0000)]
Tweak dynahash.c to avoid wasting memory space in non-shared hash tables.
palloc() will normally round allocation requests up to the next power of 2,
so make dynahash choose allocation sizes that are as close to a power of 2
as possible.
Back-patch to 8.1 --- the problem exists further back, but a much larger
patch would be needed and it doesn't seem worth taking any risks.
Bruce Momjian [Sun, 25 Jun 2006 16:27:41 +0000 (16:27 +0000)]
Add:
< * Reuse index tuples that point to rows that are not visible to anyone?
> * Reuse index tuples that point to heap tuples that are not visible to
> anyone?
Alvaro Herrera [Sun, 25 Jun 2006 04:37:55 +0000 (04:37 +0000)]
Our version of getopt_long does not set optarg upon detecting an error, as
opposed to what other versions apparently do, so it's not safe to print an
error message. Besides, getopt_long itself already did, so it's redundant
anyway.
Bruce Momjian [Sun, 25 Jun 2006 01:45:32 +0000 (01:45 +0000)]
Remove individual user copyright because the code is contributed to
PGDG:
> Yes. In fact the copyright belongs to credativ GmbH the company that
> paid Carsten for his work. As you may or may not know I'm the CEO of
> that company and can assure you that his work was contributed to the
> PostgreSQL project.
Bruce Momjian [Sun, 25 Jun 2006 00:18:24 +0000 (00:18 +0000)]
Fix Win32/Cygwin problems:
After updating to the latest cvs, and also building most of the addons
(like PLs), the following patch is neededf for win32 + Visual C++.
* Switch to use the new win32 semaphore code
* Rename win32_open to pgwin32_open. win32_open collides with symbols
defined in Perl. MingW didn't detect ig, MSVC did. And it's a bit too
generic a name to export globally, imho...
* Python defines some partially broken #pragmas in the headers when
doing a debug build. Workaround.
Bruce Momjian [Sat, 24 Jun 2006 23:47:58 +0000 (23:47 +0000)]
Update entry:
< * Allow heap reuse of UPDATEd rows if old and new versions are on the
< same heap page?
> * Allow heap reuse of UPDATEd rows if no indexed columns are changed,
> and old and new versions are on the same heap page?
< This is possible for same-page updates because a single index row
< can point to both old and new values.
> While vacuum handles DELETEs fine, updating of non-indexed columns, like
> counters, are difficult for VACUUM to handle efficiently. This method
> is possible for same-page updates because a single index row can be
> used to point to both old and new values.
Bruce Momjian [Sat, 24 Jun 2006 23:45:02 +0000 (23:45 +0000)]
Add UPDATE entry for row reuse.
>
> * Allow heap reuse of UPDATEd rows if old and new versions are on the
> same heap page?
>
> This is possible for same-page updates because a single index row
> can point to both old and new values.
> http://archives.postgresql.org/pgsql-hackers/2006-06/msg01305.php
Tom Lane [Thu, 22 Jun 2006 20:42:57 +0000 (20:42 +0000)]
pg_stop_backup was calling XLogArchiveNotify() twice for the newly created
backup history file. Bug introduced by the 8.1 change to make pg_stop_backup
delete older history files. Per report from Masao Fujii.
Tom Lane [Wed, 21 Jun 2006 19:40:31 +0000 (19:40 +0000)]
Move setup_cancel_handler() up near start of psql main(), where the
setup_win32_locks() call formerly was, to ensure that cancelConnLock is
valid when it needs to be. Per Yoshiyuki Asaba.
Tom Lane [Wed, 21 Jun 2006 18:39:42 +0000 (18:39 +0000)]
Remove ancient kluge that kept nodeAgg.c from crashing on UPDATEs involving
aggregates. We just disallowed that, and AFAICS there should be no other
cases where direct (non-aggregated) references to input columns are allowed
in a query with aggregation and no GROUP BY.
Tom Lane [Wed, 21 Jun 2006 18:30:11 +0000 (18:30 +0000)]
Disallow aggregate functions in UPDATE commands (unless within a sub-SELECT).
This is disallowed by the SQL spec because it doesn't have any very sensible
interpretation. Historically Postgres has allowed it but behaved strangely.
As of PG 8.1 a server crash is possible if the MIN/MAX index optimization gets
applied; rather than try to "fix" that, it seems best to just enforce the
spec restriction. Per report from Josh Drake and Alvaro Herrera.
Joe Conway [Wed, 21 Jun 2006 16:43:11 +0000 (16:43 +0000)]
- During dblink_open, if transaction state was IDLE, force cursor count to
initially be 0. This is needed as a previous ABORT might have wiped out
an automatically opened transaction without maintaining the cursor count.
- Fix regression test expected file for the correct ERROR message, which
we now get given the above bug fix.
Tom Lane [Wed, 21 Jun 2006 16:05:11 +0000 (16:05 +0000)]
Clean up psql variable code a little: eliminate unnecessary tests in
GetVariable() and be consistent about treatment of the list header.
Motivated by noticing strspn() taking an unreasonable percentage of
runtime --- the call removed from GetVariable() was the only one that
could be in a high-usage path ...