Peter Eisentraut [Thu, 25 Jan 2007 11:53:52 +0000 (11:53 +0000)]
Various fixes in the logic of XML functions:
- Add new SQL command SET XML OPTION (also available via regular GUC) to
control the DOCUMENT vs. CONTENT option in implicit parsing and
serialization operations.
- Subtle corrections in the handling of the standalone property in
xmlroot().
- Allow xmlroot() to work on content fragments.
- Subtle corrections in the handling of the version property in
xmlconcat().
- Code refactoring for producing XML declarations.
Bruce Momjian [Thu, 25 Jan 2007 02:50:12 +0000 (02:50 +0000)]
Remove developers list from TODO list now that we have URLs to reference
discussions.
<
<
< ---------------------------------------------------------------------------
<
<
< Developers who have claimed items are:
< --------------------------------------
< * Alvaro is Alvaro Herrera <alvherre@dcc.uchile.cl>
< * Andrew is Andrew Dunstan <andrew@dunslane.net>
< * Bruce is Bruce Momjian <bruce@momjian.us> of EnterpriseDB
< * Christopher is Christopher Kings-Lynne <chriskl@familyhealth.com.au> of
< Family Health Network
< * D'Arcy is D'Arcy J.M. Cain <darcy@druid.net> of The Cain Gang Ltd.
< * David is David Fetter <david@fetter.org>
< * Fabien is Fabien Coelho <coelho@cri.ensmp.fr>
< * Gavin is Gavin Sherry <swm@linuxworld.com.au> of Alcove Systems Engineering
< * Greg is Greg Sabino Mullane <greg@turnstep.com>
< * Jan is Jan Wieck <JanWieck@Yahoo.com> of Afilias, Inc.
< * Joe is Joe Conway <mail@joeconway.com>
< * Karel is Karel Zak <zakkr@zf.jcu.cz>
< * Magnus is Magnus Hagander <mha@sollentuna.net>
< * Marc is Marc Fournier <scrappy@hub.org> of PostgreSQL, Inc.
< * Matthew T. O'Connor <matthew@zeut.net>
< * Michael is Michael Meskes <meskes@postgresql.org> of Credativ
< * Neil is Neil Conway <neilc@samurai.com>
< * Oleg is Oleg Bartunov <oleg@sai.msu.su>
< * Pavel is Pavel Stehule <pavel.stehule@hotmail.com>
< * Peter is Peter Eisentraut <peter_e@gmx.net>
< * Philip is Philip Warner <pjw@rhyme.com.au> of Albatross Consulting Pty. Ltd.
< * Rod is Rod Taylor <pg@rbt.ca>
< * Simon is Simon Riggs <simon@2ndquadrant.com>
< * Stephan is Stephan Szabo <sszabo@megazone23.bigpanda.com>
< * Tatsuo is Tatsuo Ishii <ishii@sraoss.co.jp> of SRA OSS, Inc. Japan
< * Teodor is Teodor Sigaev <teodor@sigaev.ru>
< * Tom is Tom Lane <tgl@sss.pgh.pa.us> of Red Hat
Tom Lane [Wed, 24 Jan 2007 17:12:17 +0000 (17:12 +0000)]
Get pg_utf_mblen(), pg_utf2wchar_with_len(), and utf2ucs() all on the same
page about the maximum UTF8 sequence length we support (4 bytes since 8.1,
3 before that). pg_utf2wchar_with_len never got updated to support 4-byte
characters at all, and in any case had a buffer-overrun risk in that it
could produce multiple pg_wchars from what mblen claims to be just one UTF8
character. The only reason we don't have a major security hole is that most
callers allocate worst-case output buffers; the sole exception in released
versions appears to be pre-8.2 iwchareq() (ie, ILIKE), which can be crashed
due to zeroing out its return address --- but AFAICS that can't be exploited
for anything more than a crash, due to inability to control what gets written
there. Per report from James Russell and Michael Fuhr.
Pre-8.1 the risk is much less, but I still think pg_utf2wchar_with_len's
behavior given an incomplete final character risks buffer overrun, so
back-patch that logic change anyway.
This patch also makes sure that UTF8 sequences exceeding the supported
length (whichever it is) are consistently treated as error cases, rather
than being treated like a valid shorter sequence in some places.
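A minimal sketch of the invariant described above, not the actual C code: the length implied by the lead byte and the decoder must agree, each sequence yields exactly one code point, and over-long or truncated sequences are errors. The names `utf8_seq_len`, `decode_utf8` and the `max_len` parameter are hypothetical; the sketch does not check for overlong encodings.

```python
def utf8_seq_len(first_byte, max_len=4):
    """Return the sequence length implied by a UTF-8 lead byte.

    Raises ValueError for continuation bytes, invalid lead bytes, and
    sequences longer than max_len (4 here; 3 models the older limit).
    """
    if first_byte < 0x80:
        length = 1
    elif first_byte & 0xE0 == 0xC0:
        length = 2
    elif first_byte & 0xF0 == 0xE0:
        length = 3
    elif first_byte & 0xF8 == 0xF0:
        length = 4
    else:
        raise ValueError("invalid or unsupported UTF-8 lead byte")
    if length > max_len:
        raise ValueError("sequence exceeds supported length")
    return length


def decode_utf8(data, max_len=4):
    """Decode bytes into a list of code points, one per sequence."""
    out = []
    i = 0
    while i < len(data):
        n = utf8_seq_len(data[i], max_len)
        if i + n > len(data):
            # Incomplete final character: refuse rather than read past the end.
            raise ValueError("incomplete final character")
        # Payload bits of the lead byte: 7, 5, 4 or 3 bits for n = 1..4.
        cp = data[i] if n == 1 else data[i] & (0xFF >> (n + 1))
        for b in data[i + 1:i + n]:
            if b & 0xC0 != 0x80:
                raise ValueError("invalid continuation byte")
            cp = (cp << 6) | (b & 0x3F)
        out.append(cp)
        i += n
    return out
```

Because the output never holds more entries than the number of sequences the length function reports, a caller sizing its buffer from that count cannot be overrun.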
Tom Lane [Wed, 24 Jan 2007 01:25:47 +0000 (01:25 +0000)]
Relax an Assert() that has been found to be too strict in some situations
involving unions of types having typmods. Variants of the failure are known
to occur in 8.1 and up; not sure if it's possible in 8.0 and 7.4, but since
the code exists that far back, I'll just patch 'em all. Per report from
Brian Hurt.
Tom Lane [Tue, 23 Jan 2007 05:07:18 +0000 (05:07 +0000)]
Add CREATE/ALTER/DROP OPERATOR FAMILY commands, also COMMENT ON OPERATOR
FAMILY; and add FAMILY option to CREATE OPERATOR CLASS to allow adding a
class to a pre-existing family. Per previous discussion. Man, what a
tedious lot of cutting and pasting ...
Tom Lane [Mon, 22 Jan 2007 20:00:40 +0000 (20:00 +0000)]
Put back planner's ability to cache the results of mergejoinscansel(),
which I had removed in the first cut of the EquivalenceClass rewrite to
simplify that patch a little. But it's still important --- in a four-way
join problem mergejoinscansel() was eating about 40% of the planning time
according to gprof. Also, improve the EquivalenceClass code to re-use
join RestrictInfos rather than generating fresh ones for each join
considered. This saves some memory space but more importantly improves
the effectiveness of caching planning info in RestrictInfos.
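The caching idea can be illustrated with a small sketch. This is not the planner's code: `RestrictInfoSketch`, `estimate_scansel` and `CALLS` are hypothetical stand-ins, and the returned selectivities are placeholders. It only shows why re-using the same object across joins makes a per-object cache effective.

```python
CALLS = {"n": 0}  # counts how often the expensive estimate actually runs


class RestrictInfoSketch:
    """Hypothetical stand-in for a clause node that carries cached results."""

    def __init__(self, clause):
        self.clause = clause
        self.scansel = None  # filled in on first use


def estimate_scansel(clause):
    """Placeholder for an expensive selectivity computation."""
    CALLS["n"] += 1
    return (1.0, 1.0)  # dummy (left, right) scan fractions


def cached_scansel(rinfo):
    """Compute the estimate once per node and re-use it afterwards."""
    if rinfo.scansel is None:
        rinfo.scansel = estimate_scansel(rinfo.clause)
    return rinfo.scansel
```

If a fresh node were built for every join considered, the cache would start empty each time and the estimate would be recomputed.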
Tom Lane [Mon, 22 Jan 2007 02:17:30 +0000 (02:17 +0000)]
Adjust pgbench so it won't spit up on non-select queries returning
tuples, which is entirely possible with custom scripts (consider
RETURNING, EXPLAIN, etc).
Tom Lane [Mon, 22 Jan 2007 01:35:23 +0000 (01:35 +0000)]
Add COST and ROWS options to CREATE/ALTER FUNCTION, plus underlying pg_proc
columns procost and prorows, to allow simple user adjustment of the estimated
cost of a function call, as well as control of the estimated number of rows
returned by a set-returning function. We might eventually wish to extend this
to allow function-specific estimation routines, but there seems to be
consensus that we should try a simple constant estimate first. In particular
this provides a relatively simple way to control the order in which different
WHERE clauses are applied in a plan node, which is a Good Thing in view of the
fact that the recent EquivalenceClass planner rewrite made that much less
predictable than before.
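As a rough illustration of how a per-function cost could steer clause ordering, the sketch below sorts clauses by estimated cost. It assumes cheaper clauses are evaluated first, which the message implies but does not spell out; `order_quals` and the cost figures are hypothetical.

```python
def order_quals(quals):
    """Order (clause_text, estimated_cost) pairs cheapest first.

    Python's sort is stable, so clauses with equal cost keep their
    original relative order.
    """
    return sorted(quals, key=lambda q: q[1])


quals = [
    ("slow_fn(x)", 1000.0),   # e.g. a function declared with a high COST
    ("y = 1", 1.0),
    ("cheap_fn(z)", 1.0),
]
ordered = order_quals(quals)
```

Raising the declared cost of `slow_fn` pushes its clause later, so it runs only on rows that survive the cheaper tests.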
Tom Lane [Sat, 20 Jan 2007 23:13:01 +0000 (23:13 +0000)]
Simplify pg_am representation of ordering-capable access methods:
provide just a boolean 'amcanorder', instead of fields that specify the
sort operator strategy numbers. We have decided to require ordering-capable
AMs to use btree-compatible strategy numbers, so the old fields are
overkill (and indeed misleading about what's allowed).
Bruce Momjian [Sat, 20 Jan 2007 21:30:26 +0000 (21:30 +0000)]
Update documentation about postgresql.conf to mention default units that
match the postgresql.conf file. Also add units to descriptions that
lacked them. Wording improvements. Mention pg_settings.unit as the way
to find the default units for a setting.
Neil Conway [Sat, 20 Jan 2007 21:17:30 +0000 (21:17 +0000)]
List disabled triggers separately in psql's "\d <table>" output.
Previously, disabled triggers were not displayed any differently than
enabled ones, which was quite misleading. Patch from Brendan Jurd.
Tom Lane [Sat, 20 Jan 2007 20:45:41 +0000 (20:45 +0000)]
Refactor planner's pathkeys data structure to create a separate, explicit
representation of equivalence classes of variables. This is an extensive
rewrite, but it brings a number of benefits:
* planner no longer fails in the presence of "incomplete" operator families
that don't offer operators for every possible combination of datatypes.
* avoid generating and then discarding redundant equality clauses.
* remove bogus assumption that derived equalities always use operators
named "=".
* mergejoins can work with a variety of sort orders (e.g., descending) now,
instead of tying each mergejoinable operator to exactly one sort order.
* better recognition of redundant sort columns.
* can make use of equalities appearing underneath an outer join.
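The notion of an explicit equivalence class can be sketched with a union-find structure. This is an illustration of the concept only, not the planner's data structure; the class and method names are hypothetical.

```python
class EquivalenceClasses:
    """Group variables known to be equal, using union-find."""

    def __init__(self):
        self.parent = {}

    def find(self, var):
        """Return the representative of var's class (with path halving)."""
        self.parent.setdefault(var, var)
        while self.parent[var] != var:
            self.parent[var] = self.parent[self.parent[var]]
            var = self.parent[var]
        return var

    def add_equality(self, a, b):
        """Record a = b by merging the two classes."""
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra

    def same_class(self, a, b):
        """True if a = b is known, directly or by transitivity."""
        return self.find(a) == self.find(b)


ec = EquivalenceClasses()
ec.add_equality("a.x", "b.y")
ec.add_equality("b.y", "c.z")
```

With the classes stored explicitly, `a.x = c.z` follows from membership alone, so a derived clause need only be generated when a join actually requires it.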
Neil Conway [Sat, 20 Jan 2007 18:43:35 +0000 (18:43 +0000)]
Refactor the index AM API slightly: move currentItemData and
currentMarkData from IndexScanDesc to the opaque structs for the
AMs that need this information (currently gist and hash).
Patch from Heikki Linnakangas, fixes by Neil Conway.
Peter Eisentraut [Sat, 20 Jan 2007 15:26:28 +0000 (15:26 +0000)]
The libpq library directory was mentioned here in the wrong place, which
might lead to a previously installed libpq being used instead. But we
don't actually have to link with libpq here at all, so remove it.
Bruce Momjian [Fri, 19 Jan 2007 21:36:07 +0000 (21:36 +0000)]
Add items:
> o Allow multiple vacuums so large tables do not starve small
> tables
>
> http://archives.postgresql.org/pgsql-general/2007-01/msg00031.php
>
> o Improve control of auto-vacuum
>
> http://archives.postgresql.org/pgsql-hackers/2006-12/msg00876.php
Peter Eisentraut [Fri, 19 Jan 2007 16:58:46 +0000 (16:58 +0000)]
Add support for converting binary values (i.e. bytea) into xml values,
with new GUC parameter "xmlbinary" that controls the output encoding, as
per SQL/XML standard.
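A small sketch of the two encodings, assuming the parameter accepts the values `base64` and `hex`; `bytea_to_xml_text` is a hypothetical helper, and the letter case of hex digits in the server's output is not addressed here.

```python
import base64


def bytea_to_xml_text(value, xmlbinary="base64"):
    """Encode a bytes value as text suitable for XML character data."""
    if xmlbinary == "base64":
        return base64.b64encode(value).decode("ascii")
    if xmlbinary == "hex":
        return value.hex()
    raise ValueError("unsupported xmlbinary setting: %r" % (xmlbinary,))
```

For example, `bytea_to_xml_text(b"bar")` gives `YmFy`, while the `hex` setting gives `626172`.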
Alvaro Herrera [Fri, 19 Jan 2007 16:42:24 +0000 (16:42 +0000)]
Change the sed rules in the regression test for pg_regress hackery to create
the generated files, to help Visual C++ to run these tests. The tests still
pass in VPATH and normal builds.
Peter Eisentraut [Thu, 18 Jan 2007 13:59:11 +0000 (13:59 +0000)]
Clean up encoding issues in the xml type: In text mode, encoding
declarations are ignored and removed, in binary mode they are honored as
specified by the XML standard.
Neil Conway [Wed, 17 Jan 2007 16:19:08 +0000 (16:19 +0000)]
Tweak the width_bucket() regression tests to avoid an unnecessary
dependency on the platform's floating point implementation. Per
report from Stefan Kaltenbrunner.
Tom Lane [Wed, 17 Jan 2007 00:17:21 +0000 (00:17 +0000)]
Revise bgwriter fsync-request mechanism to improve robustness when a table
is deleted. A backend about to unlink a file now sends a "revoke fsync"
request to the bgwriter to make it clean out pending fsync requests. There
is still a race condition where the bgwriter may try to fsync after the unlink
has happened, but we can resolve that by rechecking the fsync request queue
to see if a revoke request arrived meanwhile. This eliminates the former
kluge of "just assuming" that an ENOENT failure is okay, and lets us handle
the fact that on Windows it might be EACCES too without introducing any
questionable assumptions. After an idea of mine improved by Magnus.
The HEAD patch doesn't apply cleanly to 8.2, but I'll see about a back-port
later. In the meantime this could do with some testing on Windows; I've been
able to force it through the code path via ENOENT, but that doesn't prove that
it actually fixes the Windows problem ...
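The recheck logic described above might look roughly like this. It is a single-process sketch under simplifying assumptions, not the bgwriter code: `FsyncQueueSketch`, its message tuples and the `do_fsync` callback are all hypothetical.

```python
from collections import deque


class FsyncQueueSketch:
    """Toy model of pending fsync requests with revoke support."""

    def __init__(self):
        self.inbox = deque()   # requests sent by backends, not yet absorbed
        self.pending = set()   # files still waiting for an fsync

    def send(self, kind, path):
        """Backend side: queue an 'fsync' or 'revoke' request."""
        self.inbox.append((kind, path))

    def absorb(self):
        """Apply all queued requests to the pending set."""
        while self.inbox:
            kind, path = self.inbox.popleft()
            if kind == "fsync":
                self.pending.add(path)
            elif kind == "revoke":
                self.pending.discard(path)

    def sync_all(self, do_fsync):
        """Fsync every pending file; tolerate failures only if revoked."""
        self.absorb()
        synced = []
        for path in sorted(self.pending):
            if path not in self.pending:
                continue  # revoked while earlier files were being processed
            try:
                do_fsync(path)
            except OSError:
                # Recheck the queue: a revoke may have arrived meanwhile.
                self.absorb()
                if path in self.pending:
                    raise  # no revoke, so this is a genuine failure
                continue
            self.pending.discard(path)
            synced.append(path)
        return synced
```

The point of the recheck is that the failure is excused by an explicit revoke request rather than by guessing from the errno value, so ENOENT and EACCES are handled the same way.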
Neil Conway [Tue, 16 Jan 2007 21:43:19 +0000 (21:43 +0000)]
vcbuild updates from Magnus:
* After Marko's patch, now builds pgcrypto without zlib again
* Updates README with xml info
* xml requires xslt and iconv
* disable unnecessary warning about __cdecl()
* Add a buildenv.bat called from all other bat files to set up things
like PATH for flex/bison. (Can't just set it before calling, doesn't
always work when building from the GUI)
AFAICS SQL:2003 does not define a NaN value, so it doesn't address how
width_bucket() should behave here. The patch changes width_bucket() so
that ereport(ERROR) is raised if NaN is specified for the operand or the
lower or upper bounds to width_bucket(). For float8, NaN is disallowed
for any of the floating-point inputs, and +/- infinity is disallowed
for the histogram bounds (but allowed for the operand).
Update docs and regression tests, bump the catversion.
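The float8 input rules described above can be sketched as follows. This is an approximation in Python, not the C implementation; the checks on `count` and on equal bounds are assumed here in addition to the NaN and infinity rules stated in the message.

```python
import math


def width_bucket(operand, bound1, bound2, count):
    """Return the bucket number of operand in an equi-width histogram."""
    if count <= 0:
        raise ValueError("count must be greater than zero")
    if any(math.isnan(v) for v in (operand, bound1, bound2)):
        raise ValueError("operand, lower bound and upper bound cannot be NaN")
    if math.isinf(bound1) or math.isinf(bound2):
        raise ValueError("lower and upper bounds must be finite")
    if bound1 == bound2:
        raise ValueError("lower bound cannot equal upper bound")

    if bound1 < bound2:
        if operand < bound1:
            return 0
        if operand >= bound2:
            return count + 1
        return int((operand - bound1) * count / (bound2 - bound1)) + 1

    # Reversed bounds: buckets are numbered from the larger bound downwards.
    if operand > bound1:
        return 0
    if operand <= bound2:
        return count + 1
    return int((bound1 - operand) * count / (bound1 - bound2)) + 1
```

An infinite operand needs no special case: it simply lands in bucket 0 or bucket count + 1.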
Tom Lane [Tue, 16 Jan 2007 18:32:26 +0000 (18:32 +0000)]
Fix incorrect permissions check in information_schema.key_column_usage view:
it was checking a pg_constraint OID instead of pg_class OID, resulting in
"relation with OID nnnnn does not exist" failures for anyone who wasn't
owner of the table being examined. Per bug #2848 from Laurence Rowe.
Note: for existing 8.2 installations a simple version update won't fix this;
the easiest fix is to CREATE OR REPLACE this view with the corrected
definition.
Alvaro Herrera [Tue, 16 Jan 2007 13:28:57 +0000 (13:28 +0000)]
Arrange for autovacuum to be killed when another operation, like DROP
DATABASE, wants exclusive access to the database. This allows the regression
tests to pass with autovacuum enabled, which opens the gates for finally
enabling autovacuum by default.
Neil Conway [Sun, 14 Jan 2007 22:37:59 +0000 (22:37 +0000)]
Add a note to the docs describing NaN's equality and ordering behavior.
Per recent -hackers thread, this is noteworthy because Postgres behaves
differently from most implementations of NaN, including IEEE754.
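The difference can be illustrated with a short sketch. It models the documented Postgres behavior (NaN equals NaN and sorts after every non-NaN value) next to Python's IEEE 754 floats; `pg_float_eq` and `pg_float_key` are hypothetical helper names.

```python
import math


def pg_float_eq(a, b):
    """Equality in which NaN is equal to NaN."""
    if math.isnan(a) or math.isnan(b):
        return math.isnan(a) and math.isnan(b)
    return a == b


def pg_float_key(x):
    """Sort key that places NaN after every other value, infinity included."""
    return (1, 0.0) if math.isnan(x) else (0, x)


nan = float("nan")
ordered = sorted([nan, 1.0, float("inf"), -2.0], key=pg_float_key)
```

Under IEEE 754 rules `nan == nan` is false and NaN has no place in a total order; treating it as equal to itself and largest gives every value a definite sort position.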
Bruce Momjian [Sat, 13 Jan 2007 15:13:44 +0000 (15:13 +0000)]
Remove completed items, and the last is unwanted:
< o Fix memory leak from exceptions
<
< http://archives.postgresql.org/pgsql-performance/2006-06/msg00305.php
<
< * Allow constraint_exclusion to work for UNIONs like it does for
< inheritance, allow it to work for UPDATE and DELETE statements, and allow
< it to be used for all statements with little performance impact
<
< * Add estimated_count(*) to return an estimate of COUNT(*)
<
< This would use the planner ANALYZE statistics to return an estimated
< count.
< http://archives.postgresql.org/pgsql-hackers/2005-11/msg00943.php
Tom Lane [Fri, 12 Jan 2007 23:34:55 +0000 (23:34 +0000)]
Fix handling of CC (century) format spec in to_date/to_char. According to
standard convention the 21st century runs from 2001-2100, not 2000-2099,
so make it work like that. Per bug #2885 from Akio Iwaasa.
Backpatch to 8.2, but no further, since this is really a definitional
change; users of older branches are probably more interested in stability.
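The convention amounts to a one-line formula, sketched here for positive years only; `century_of` is a hypothetical name and years before 1 AD are deliberately left out.

```python
def century_of(year):
    """Return the century of a positive year; 2001-2100 is the 21st."""
    if year <= 0:
        raise ValueError("only years from 1 AD onwards are handled here")
    return (year + 99) // 100
```

So the year 2000 belongs to century 20 and 2001 to century 21, whereas a plain `year // 100 + 1` would put 2000 in century 21.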
Tom Lane [Fri, 12 Jan 2007 17:04:54 +0000 (17:04 +0000)]
Add some notes about the basic mathematical laws that the system presumes
hold true for operators in a btree operator family. This is mostly to
clarify my own thinking about what the planner can assume for optimization
purposes. (blowing dust off an old abstract-algebra textbook...)
Peter Eisentraut [Fri, 12 Jan 2007 16:29:24 +0000 (16:29 +0000)]
Allow for arbitrary data types as content in XMLELEMENT. The original
coercion to type xml was a mistake. Escape values so they are valid
XML character data.
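A minimal sketch of the escaping step, using Python's standard library rather than the server's code; `xml_element` is a hypothetical helper and does not validate the element name.

```python
from xml.sax.saxutils import escape


def xml_element(name, content):
    """Wrap a value of any type in an element, escaping &, < and >."""
    return "<{0}>{1}</{0}>".format(name, escape(str(content)))
```

For example, `xml_element("foo", "a < b & c")` yields `<foo>a &lt; b &amp; c</foo>`, and a non-text value such as `42` is converted to text first.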
Tom Lane [Thu, 11 Jan 2007 23:06:03 +0000 (23:06 +0000)]
Fix a performance problem in databases with large numbers of tables
(or other types of pg_class entry): the function pgstat_vacuum_tabstat,
invoked during VACUUM startup, had runtime proportional to the number of
stats table entries times the number of pg_class rows; in other words
O(N^2) if the stats collector's information is reasonably complete.
Replace list searching with a hash table to bring it back to O(N)
behavior. Per report from kim at myemma.com.
Back-patch as far as 8.1; 8.0 and before use different coding here.
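The complexity change can be sketched in a few lines. This is only an illustration of list search versus hash lookup, not the pgstat code; the function names and the use of plain integer OIDs are assumptions.

```python
def dead_entries_list_search(stat_oids, live_oids):
    """O(N*M): every stats entry is checked by scanning the whole list."""
    live = list(live_oids)
    return [oid for oid in stat_oids if oid not in live]


def dead_entries_hashed(stat_oids, live_oids):
    """O(N+M): build a hash set once, then do constant-time lookups."""
    live = set(live_oids)
    return [oid for oid in stat_oids if oid not in live]
```

Both return the stats entries that no longer have a matching catalog row; only the lookup cost differs.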
Michael Meskes [Thu, 11 Jan 2007 15:47:34 +0000 (15:47 +0000)]
Applied Joachim's patch for a --regression option.
Made this option mark the .c files, so the environment variable is no longer needed.
Created a special MinGW file with the special error message.
Do not print port into log file when running regression tests.