Tom Lane [Mon, 1 Oct 2007 16:25:56 +0000 (16:25 +0000)]
Avoid assuming that struct varattrib_pointer doesn't get padded by the
compiler --- at least on ARM, it does. I suspect that the varvarlena patch
has been creating larger-than-intended toast pointers all along on ARM,
but it wasn't exposed until the latest tweak added some Asserts that
calculated the expected size in a different way. We could probably have
fixed this by adding __attribute__((packed)) as is done for ItemPointerData,
but struct varattrib_pointer isn't really all that useful anyway, so it
seems cleanest to just get rid of it and have only struct varattrib_1b_e.
Per results from buildfarm member quagga.
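The padding hazard described above can be sketched in a few lines of C. This is an illustration only, not the actual header definitions: the type names and the 16-byte payload are assumptions chosen so that the total comes to the 18 bytes mentioned in the next-older commit. The point is that a size derived from offsetof() plus the payload size does not depend on whether the compiler pads the struct, whereas sizeof() of a fixed-size struct may.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-byte payload standing in for the body of a TOAST pointer. */
typedef struct
{
    int32_t  rawsize;
    int32_t  extsize;
    uint32_t valueid;
    uint32_t toastrelid;
} toast_payload;

/* Header followed by a flexible array: the payload offset is well defined. */
typedef struct
{
    uint8_t header;   /* tag byte */
    uint8_t len;      /* explicit length byte */
    char    data[];   /* payload bytes start here */
} toast_datum_hdr;

/* Fixed-size variant: some ABIs may round sizeof() up past 18. */
typedef struct
{
    uint8_t header;
    uint8_t len;
    char    data[sizeof(toast_payload)];
} toast_datum_fixed;

/* Expected datum size, computed without relying on sizeof(struct). */
static size_t
toast_pointer_size(void)
{
    return offsetof(toast_datum_hdr, data) + sizeof(toast_payload);
}
```

On a compiler that pads toast_datum_fixed, sizeof(toast_datum_fixed) exceeds toast_pointer_size(), which is the kind of mismatch an Assert comparing the two calculations would expose.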
Tom Lane [Sun, 30 Sep 2007 19:54:58 +0000 (19:54 +0000)]
Add an extra header byte to TOAST-pointer datums to represent their size
explicitly. This means a TOAST pointer takes 18 bytes instead of 17 --- still
smaller than in 8.2 --- which seems a good tradeoff to ensure we won't have
painted ourselves into a corner if we want to support multiple types of TOAST
pointer later on. Per discussion with Greg Stark.
Tom Lane [Sun, 30 Sep 2007 17:28:56 +0000 (17:28 +0000)]
Adjust recovery PS display as agreed with Simon: 'waiting for XXX'
while the restore_command does its thing, then 'recovering XXX' while
processing the segment file. These operations are heavyweight enough
that an extra PS display set shouldn't bother anyone.
Tom Lane [Sun, 30 Sep 2007 17:13:19 +0000 (17:13 +0000)]
Properly mark mergeable/hashable equality operators (found by opr_sanity
testing). Combine the formerly independent opclasses for the various
ISN types into opfamilies. The latter causes some extra bleating from
opr_sanity, since the module doesn't provide complete sets of cross-type
operators, but it's still a good idea because it will give the planner
more information to work with. The missing cross-type operators no longer
pose a risk of unexpected planner errors in 8.3, so there's no need to
insist on filling them in (and I gather it wouldn't be very sound
semantically to add them all).
Tom Lane [Sat, 29 Sep 2007 23:32:42 +0000 (23:32 +0000)]
Remove bogus commutator marking --- the module doesn't actually supply
any commutator operator for =(chkpass,text), so this was creating a
shell operator that would fail on use. Found by opr_sanity testing.
Tom Lane [Sat, 29 Sep 2007 18:05:20 +0000 (18:05 +0000)]
Disallow CLUSTER using an invalid index (that is, one left over from a failed
CREATE INDEX CONCURRENTLY). Such an index might not have entries for every
heap row and thus clustering with it would result in silent data loss.
The scenario requires a pretty foolish DBA, but still ...
Tom Lane [Sat, 29 Sep 2007 17:18:58 +0000 (17:18 +0000)]
Improve consistency of the error messages generated when you try to use
ALTER TABLE on a composite type or ALTER TYPE on a table's rowtype.
We already rejected these cases, but the error messages were a bit
random and didn't always provide a HINT to use the other command type.
Tom Lane [Sat, 29 Sep 2007 01:36:10 +0000 (01:36 +0000)]
Make archive recovery always start a new timeline, rather than only when a
recovery stop time was used. This avoids a corner-case risk of trying to
overwrite an existing archived copy of the last WAL segment, and seems
simpler and cleaner all around than the original definition. Per example
from Jon Colverson and subsequent analysis by Simon.
Tom Lane [Sat, 29 Sep 2007 00:01:43 +0000 (00:01 +0000)]
Teach chklocale.c about a few names for frontend-only encodings,
since this will allow initdb to reject attempts to initdb in a locale
that uses such an encoding. We'll probably find out more such names
during beta ...
Tom Lane [Fri, 28 Sep 2007 22:25:49 +0000 (22:25 +0000)]
Change initdb and CREATE DATABASE to actively reject attempts to create
databases with encodings that are incompatible with the server's LC_CTYPE
locale, when we can determine that (which we can on most modern platforms,
I believe). C/POSIX locale is compatible with all encodings, of course,
so there is still some usefulness to CREATE DATABASE's ENCODING option,
but this will insulate us against all sorts of recurring complaints
caused by mismatched settings.
I moved initdb's existing LC_CTYPE-to-encoding mapping knowledge into
a new src/port/ file so it could be shared by CREATE DATABASE.
Tom Lane [Thu, 27 Sep 2007 20:39:43 +0000 (20:39 +0000)]
Tweak pgbench.c to remove the hidden assumption that a WIN32 machine
couldn't possibly HAVE_GETOPT. I believe this is the most appropriate
form of the patch submitted 2007-08-07 by Hiroshi Saito, though not
having a Windows build environment I won't know for sure till I see
the buildfarm results.
Tom Lane [Thu, 27 Sep 2007 19:53:44 +0000 (19:53 +0000)]
Define the FRONTEND symbol in postgres_fe.h, which allows us to eliminate
duplicative -DFRONTEND flags from many Makefiles. We still need Makefile
control of the symbol in a few places that compile frontend-or-backend
src/port/ files, but it's a lot cleaner than before.
Tom Lane [Thu, 27 Sep 2007 18:15:36 +0000 (18:15 +0000)]
Add virtual transaction IDs to CSVLOG output, so that messages coming from
the same transaction can be identified even when no regular XID was assigned.
This seems essential after addition of the lazy-XID patch. Also some
minor code cleanup in write_csvlog().
Tom Lane [Thu, 27 Sep 2007 17:42:03 +0000 (17:42 +0000)]
Fix Assert failure in ExpandColumnRefStar --- what I thought was a can't-happen
condition can happen given incorrect input. The real problem is that
gram.y should try harder to distinguish * from "*" --- the latter is a legal
column name per spec, and someday we ought to treat it that way. However
fixing that is too invasive for a back-patch, and it's too late for the 8.3
cycle too. So just reduce the Assert to a plain elog for now. Per report
from NikhilS.
Tom Lane [Wed, 26 Sep 2007 23:29:10 +0000 (23:29 +0000)]
Some small tuptoaster improvements from Greg Stark. Avoid unnecessary
decompression of an already-compressed external value when we have to copy
it; save a few cycles when a value is too short for compression; and
annotate various lines that are currently unreachable.
Adjust the new memory limit in the lazy vacuum code to use MaxHeapTuplesPerPage
tuples per page instead of fixed 200, to better cope with systems that use a
different block size.
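A minimal sketch of the sizing rule described above, assuming a 6-byte dead-tuple pointer and taking the per-page tuple count as a parameter. The function and macro names are hypothetical, and any floor or ceiling the real code applies is left out.

```c
#include <stddef.h>
#include <stdint.h>

#define TID_SIZE 6   /* assumed bytes per dead-tuple pointer */

/*
 * Number of dead-tuple slots to allocate: the memory-based limit, capped at
 * rel_blocks * tuples_per_page so that a small table gets a small array.
 */
static size_t
dead_tuple_slots(size_t mem_limit_bytes, uint32_t rel_blocks,
                 uint32_t tuples_per_page)
{
    size_t by_memory = mem_limit_bytes / TID_SIZE;

    /* Divide rather than multiply so the comparison cannot overflow. */
    if (tuples_per_page > 0 && by_memory / tuples_per_page > rel_blocks)
        return (size_t) rel_blocks * tuples_per_page;
    return by_memory;
}
```

With a 16MB limit and an illustrative 291 tuples per page, a 10-block table needs only 2910 slots, while a million-block table stays at the memory-based limit.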
Tom Lane [Wed, 26 Sep 2007 18:51:51 +0000 (18:51 +0000)]
Create a function variable "join_search_hook" to let plugins override the
join search order portion of the planner; this is specifically intended to
simplify developing a replacement for GEQO planning. Patch by Julius
Stroffek, editorialized on by me. I renamed make_one_rel_by_joins to
standard_join_search and make_rels_by_joins to join_search_one_level to better
reflect their place within this scheme.
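The hook mechanism can be illustrated with a toy example. Only the names join_search_hook and standard_join_search come from the commit message; the types, the signature, and the plugin function below are simplified assumptions (the real hook operates on planner data structures, not an int).

```c
#include <stddef.h>

/* Toy stand-in for a planner result. */
typedef struct
{
    int nrels;              /* number of relations joined */
    int chosen_by_plugin;   /* 1 if a plugin produced this plan */
} join_plan;

/* Hypothetical, simplified hook signature. */
typedef join_plan (*join_search_hook_type) (int nrels);

/* NULL means no plugin has installed a replacement. */
static join_search_hook_type join_search_hook = NULL;

/* Default join search, used when the hook is unset. */
static join_plan
standard_join_search(int nrels)
{
    join_plan plan = {nrels, 0};

    return plan;
}

/* Caller side: consult the hook first, else fall back to the default. */
static join_plan
plan_joins(int nrels)
{
    if (join_search_hook != NULL)
        return join_search_hook(nrels);
    return standard_join_search(nrels);
}

/* What a plugin might install (hypothetical). */
static join_plan
my_join_search(int nrels)
{
    join_plan plan = {nrels, 1};

    return plan;
}
```

A plugin would assign join_search_hook = my_join_search when it is loaded; from then on plan_joins() routes through the plugin, and the plugin can still call standard_join_search() itself when it wants the default behavior.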
Tom Lane [Wed, 26 Sep 2007 01:10:42 +0000 (01:10 +0000)]
In the integer-datetimes case, date2timestamp and date2timestamptz need
to check for overflow because the legal range of type date is actually
wider than timestamp's. Problem found by Neil Conway.
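A sketch of the kind of overflow check involved, assuming a date is a 32-bit day count and an integer timestamp is a 64-bit microsecond count. The function name is hypothetical, and this version tests the range before multiplying instead of inspecting the product afterwards.

```c
#include <stdbool.h>
#include <stdint.h>

#define USECS_PER_DAY INT64_C(86400000000)

/*
 * Convert a day count to microseconds.  Returns false, leaving *result
 * untouched, when the product would not fit in 64 bits.
 */
static bool
days_to_usecs(int32_t days, int64_t *result)
{
    if (days > INT64_MAX / USECS_PER_DAY ||
        days < INT64_MIN / USECS_PER_DAY)
        return false;           /* a 32-bit day count can exceed the range */

    *result = (int64_t) days * USECS_PER_DAY;
    return true;
}
```

The largest day count that passes is 106751991, far below INT32_MAX, which is why the conversion cannot simply assume the multiplication is safe.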
Tom Lane [Wed, 26 Sep 2007 00:32:46 +0000 (00:32 +0000)]
Use SYSV semaphores rather than POSIX on Darwin >= 6.0 (i.e., OS X 10.2
and up), per Chris Marcellino. This avoids consuming O(N^2) file
descriptors to support N backends. Tests suggest it's about a wash for
small installations, but large ones would have a problem.
Tom Lane [Tue, 25 Sep 2007 22:21:55 +0000 (22:21 +0000)]
Change on-disk representation of NUMERIC datatype so that the sign_dscale
word comes before the weight instead of after. This will allow future
binary-compatible extension of the representation to support compact formats,
as discussed on pgsql-hackers around 2007/06/18. The reason to do it now is
that we've already pretty well broken any chance of simple in-place upgrade
from 8.2 to 8.3, but it's possible that 8.3 to 8.4 (or whenever we get around
to squeezing NUMERIC) could otherwise be data-compatible.
Tom Lane [Tue, 25 Sep 2007 22:11:48 +0000 (22:11 +0000)]
Dept. of second thoughts: fix loop in BgBufferSync so that the exit when
bgwriter_lru_maxpages is exceeded leaves the loop variables in the
expected state. In the original coding, we'd fail to advance
next_to_clean, causing that buffer to be probably-uselessly rechecked next
time, and also have an off-by-one idea of the number of buffers scanned.
Tom Lane [Tue, 25 Sep 2007 20:03:38 +0000 (20:03 +0000)]
Just-in-time background writing strategy. This code avoids re-scanning
buffers that cannot possibly need to be cleaned, and estimates how many
buffers it should try to clean based on moving averages of recent allocation
requests and density of reusable buffers. The patch also adds a couple
more columns to pg_stat_bgwriter to help measure the effectiveness of the
bgwriter.
Greg Smith, building on his own work and ideas from several other people,
in particular a much older patch from Itagaki Takahiro.
Avoid having autovacuum read pgstats data too many times in quick succession.
This is problematic for the autovac launcher when there are many databases,
so we keep data for a full second before reading it again.
Reduce the size of memory allocations by lazy vacuum when processing a small
table, by allocating just enough for a hardcoded number of dead tuples per
page. The current estimate is 200 dead tuples per page.
Per reports from Jeff Amiel, Erik Jones and Marko Kreen, and subsequent
discussion.
Andrew Dunstan [Mon, 24 Sep 2007 01:29:30 +0000 (01:29 +0000)]
Remove "convert 'blah' using conversion_name" facility, because if it
produces text it is an encoding hole and if not it's incompatible
with the spec, whatever the spec means (which we're not sure about anyway).
Andrew Dunstan [Sun, 23 Sep 2007 21:52:56 +0000 (21:52 +0000)]
Add perl replacements for build.bat and vcregress.bat. In due course
the .bat files will be altered to become tiny wrappers for these scripts,
and one or two other .bat files will disappear.
Tom Lane [Sun, 23 Sep 2007 18:50:38 +0000 (18:50 +0000)]
TransactionIdIsInProgress can skip scanning the ProcArray if the target XID is
later than latestCompletedXid, per Florian Pflug. Also some minor
improvements in the XIDCACHE_DEBUG code --- make sure each call of
TransactionIdIsInProgress is counted one way or another.
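The shortcut can be sketched as follows, assuming that an XID later than every completed XID must still be running and that XIDs are compared modulo 2^32. All names here are hypothetical, and a plain array stands in for the shared ProcArray.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t xid_t;

/* Wraparound-aware "a is later than b", relying on two's-complement int32. */
static bool
xid_follows(xid_t a, xid_t b)
{
    return (int32_t) (a - b) > 0;
}

/*
 * Toy in-progress test.  *scanned reports how many array entries were
 * examined, to make the effect of the shortcut visible.
 */
static bool
xid_is_in_progress(xid_t xid, xid_t latest_completed,
                   const xid_t *running, int nrunning, int *scanned)
{
    *scanned = 0;

    /* Later than anything that has completed: no scan is needed. */
    if (xid_follows(xid, latest_completed))
        return true;

    for (int i = 0; i < nrunning; i++)
    {
        (*scanned)++;
        if (running[i] == xid)
            return true;
    }
    return false;
}
```

In this sketch the early return touches no array entries at all, while the fallback path walks the array until it finds a match or runs out of entries.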
Tom Lane [Sun, 23 Sep 2007 15:58:58 +0000 (15:58 +0000)]
Temporarily modify tsearch regression tests to suppress notice that comes
out at erratic times, because it is creating a totally unacceptable level
of noise in our buildfarm results. This patch can be reverted when and if
the code is fixed to not issue notices during cache reload events.
Tom Lane [Sat, 22 Sep 2007 21:36:40 +0000 (21:36 +0000)]
Fix cost estimates for EXISTS subqueries that are evaluated as initPlans
(because they are uncorrelated with the immediate parent query). We were
charging the full run cost to the parent node, disregarding the fact that
only one row need be fetched for EXISTS. While this would only be a
cosmetic issue in most cases, it might possibly affect planning outcomes
if the parent query were itself a subquery to some upper query.
Per recent discussion with Steve Crawford.
Andrew Dunstan [Sat, 22 Sep 2007 20:38:10 +0000 (20:38 +0000)]
Replace calls to external dir program with perlish globs and File::Find
calls. Fixes complaint from Hannes Eder, whose environment found a different
dir program.
Tom Lane [Sat, 22 Sep 2007 19:10:44 +0000 (19:10 +0000)]
Document the translations from Postgres message severity levels to
syslog and eventlog severity levels, per suggestion from Josh Drake.
Also, some wordsmithing for the csvlog documentation.
Tom Lane [Sat, 22 Sep 2007 18:19:18 +0000 (18:19 +0000)]
Fix erroneous Assert() in syslogger process start in EXEC_BACKEND case,
per ITAGAKI Takahiro. Also, rewrite syslogger_forkexec() in hopes of
eliminating the confusion in the first place.
Tom Lane [Sat, 22 Sep 2007 04:40:03 +0000 (04:40 +0000)]
Although I'd misdiagnosed the reason for the recent failures on
buildfarm member grebe, I see no reason to revert the 1-byte-header-friendly
changes I made in varlena.c. Instead, tweak the code a little bit to
get more advantage out of that.
Andrew Dunstan [Sat, 22 Sep 2007 03:58:34 +0000 (03:58 +0000)]
Go back to using a separate method for doing ILIKE for single byte
character encodings that doesn't involve calling lower(). This should
cure the performance regression in this case complained of by Guillaume
Smet. It still leaves the horrid performance for multi-byte encodings
introduced in 8.2, but there's no obvious solution for that in sight.
Tom Lane [Sat, 22 Sep 2007 00:36:38 +0000 (00:36 +0000)]
Fix varlena.c routines to allow 1-byte-header text values. This is now
demonstrably necessary for text_substring() since regexp_split functions
may pass it such a value; and we might as well convert the whole file
at once. Per buildfarm results (though I wonder why most machines aren't
showing a failure).
Tom Lane [Fri, 21 Sep 2007 22:52:52 +0000 (22:52 +0000)]
Fix regex, LIKE, and some other second-rank text-manipulation functions
to not cause needless copying of text datums that have 1-byte headers.
Greg Stark, in response to performance gripe from Guillaume Smet and
ITAGAKI Takahiro.
Tom Lane [Fri, 21 Sep 2007 21:25:42 +0000 (21:25 +0000)]
Improve handling of prune/no-prune decisions by storing a page's oldest
unpruned XMAX in its header. At the cost of 4 bytes per page, this keeps us
from performing heap_page_prune when there's no chance of pruning anything.
Seems to be necessary per Heikki's preliminary performance testing.
Tom Lane [Fri, 21 Sep 2007 18:24:28 +0000 (18:24 +0000)]
Change tqual.c tests to use !TransactionIdIsCurrentTransactionId, rather than
TransactionIdDidAbort, when handling the case that xmin is one of the current
transaction's XIDs and the tuple has been deleted. xmax must also be one of
the current transaction's XIDs, since no one else can see it yet, and it's
cheaper to look at local state than shared state to find out if xmax aborted.
Per an idea of Heikki's.
Tom Lane [Fri, 21 Sep 2007 17:36:53 +0000 (17:36 +0000)]
Make some simple performance improvements in TransactionIdIsInProgress().
For XIDs of our own transaction and subtransactions, it's cheaper to ask
TransactionIdIsCurrentTransactionId() than to look in shared memory.
Also, the xids[] work array is always the same size within any given
process, so malloc it just once instead of doing a palloc/pfree on every
call; aside from being faster this lets us get rid of some goto's, since
we no longer have any end-of-function pfree to do. Both ideas by Heikki.
Tom Lane [Fri, 21 Sep 2007 00:30:49 +0000 (00:30 +0000)]
Insert a hack in pl/tcl to disable Tcl's built-in Notifier subsystem, which
has a bad habit of launching multiple threads within the backend and thereby
causing all kinds of havoc. Fortunately, we don't need it, and recent Tcl
versions provide an easy way to disable it. Diagnosis and fix by
Steve Marshall, Paul Bayer, and Doug Knight of WSI Corporation.
Bruce Momjian [Thu, 20 Sep 2007 18:54:19 +0000 (18:54 +0000)]
Done:
> * -Consider shrinking expired tuples to just their headers
> * -Allow heap reuse of UPDATEd rows if no indexed columns are changed,
> and old and new versions are on the same heap page
Not needed anymore:
< * Reuse index tuples that point to heap tuples that are not visible to
< anyone?
Tom Lane [Thu, 20 Sep 2007 17:56:33 +0000 (17:56 +0000)]
HOT updates. When we update a tuple without changing any of its indexed
columns, and the new version can be stored on the same heap page, we no longer
generate extra index entries for the new version. Instead, index searches
follow the HOT-chain links to ensure they find the correct tuple version.
In addition, this patch introduces the ability to "prune" dead tuples on a
per-page basis, without having to do a complete VACUUM pass to recover space.
VACUUM is still needed to clean up dead index entries, however.
Pavan Deolasee, with help from a bunch of other people.
Neil Conway [Wed, 19 Sep 2007 22:31:48 +0000 (22:31 +0000)]
Prevent corr() from returning the wrong results for negative correlation
values. The previous coding essentially assumed that x = sqrt(x*x), which
does not hold for x < 0.
Thanks to Jie Zhang at Greenplum and Gavin Sherry for reporting this
issue.
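The sign problem can be written out explicitly. The notation is an assumption, not taken from the code: S_xx, S_yy and S_xy denote the sums of squared deviations and of deviation cross-products. The commit message says only that the old coding "essentially assumed" x = sqrt(x*x), so the squared form below is a reconstruction of that assumption rather than a quote of the previous source.

```latex
r = \frac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}},
\qquad
\sqrt{\frac{S_{xy}^{2}}{S_{xx}\,S_{yy}}} = \lvert r \rvert
```

For example, x = (1, 2, 3) and y = (3, 2, 1) give S_xx = 2, S_yy = 2 and S_xy = -2, so r = -1, whereas the squared form yields +1.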