Tom Lane [Thu, 8 Dec 2005 19:19:22 +0000 (19:19 +0000)]
Fix bgwriter's failure to release buffer pins and open files after an
error. This probably explains bug #2099 and could also account for
mysterious VACUUM hangups.
This is used by winsock2.h. However, construction of a windows base is
winsock.h.
Then, since MinGW has a special environment, this is right, but it is not
found in VC6.
Furthermore, in getaddrinfo.c, IPV6-API is used by
LoadLibraryA("ws2_32");
Referring to of dll the external memory generates this violation by VC6
specification.
I considered whether the whole should have been converted into winsock2.
However, now, the DLL of MinGW creation operates wonderfully as it is.
That's right, it has pliability by replacement of a simple DLL.
Then, I propose the system using winsock (non-IPv6) in construction of
VC6.
Tom Lane [Wed, 7 Dec 2005 19:37:53 +0000 (19:37 +0000)]
Push the responsibility for handling ignore_killed_tuples down into
_bt_checkkeys(), instead of checking it in the top-level nbtree.c routines
as formerly. This saves a little bit of loop overhead, but more importantly
it lets us skip performing the index key comparisons for dead tuples.
Tom Lane [Wed, 7 Dec 2005 18:03:48 +0000 (18:03 +0000)]
A couple of tiny performance hacks in _bt_step(). Remove PageIsEmpty
checks, which were once needed because PageGetMaxOffsetNumber would
fail on empty pages, but are now just redundant. Also, don't set up
local variables that aren't needed in the fast path --- most of the
time, we only need to advance offnum and not step across a page boundary.
Motivated by noticing _bt_step at the top of OProfile profile for a
pgbench run.
Tom Lane [Tue, 6 Dec 2005 23:08:34 +0000 (23:08 +0000)]
Get rid of slru.c's hardwired insistence on a fixed number of slots per
SLRU area. The number of slots is still a compile-time constant (someday
we might want to change that), but at least it's a different constant for
each SLRU area. Increase number of subtrans buffers to 32 based on
experimentation with a heavily subtrans-bashing test case, and increase
number of multixact member buffers to 16, since it's obviously silly for
it not to be at least twice the number of multixact offset buffers.
Bruce Momjian [Tue, 6 Dec 2005 18:43:26 +0000 (18:43 +0000)]
Since my name has a non-ascii-letter in it, it's often spelled wrong. In
the latest release notes there is a latin1 character that shouldn't be
there so I made a patch to fix that. This patch also fixes some old
entries that use o instead of ö (which is also wrong but not as
bad as including a latin1 character in the sgml file).
Tom Lane [Tue, 6 Dec 2005 18:10:06 +0000 (18:10 +0000)]
Arrange for read-only accesses to SLRU page buffers to take only a shared
lock, not exclusive, if the desired page is already in memory. This can
be demonstrated to be a significant win on the pg_subtrans cache when there
is a large window of open transactions. It should be useful for pg_clog
as well. I didn't try to make GetMultiXactIdMembers() use the code, as
that would have taken some restructuring, and what with the local cache
for multixact contents it probably wouldn't really make a difference.
Per my recent proposal.
Tom Lane [Tue, 6 Dec 2005 16:50:36 +0000 (16:50 +0000)]
In a nestloop inner indexscan, it's OK to use pushed-down baserestrictinfo
clauses even if it's an outer join. This is a corner case since such
clauses could only arise from weird OUTER JOIN ON conditions, but worth
fixing. Per example from Ron at cheapcomplexdevices.com.
Tom Lane [Tue, 6 Dec 2005 02:29:04 +0000 (02:29 +0000)]
Make Win32 build use our port/snprintf.c routines, instead of depending
on libintl which may or may not provide what we need. Make a few marginal
cleanups to ensure this works. Andrew Dunstan and Tom Lane.
Tom Lane [Mon, 5 Dec 2005 02:39:38 +0000 (02:39 +0000)]
Fix a rather sizable number of problems in our homegrown snprintf, such as
incorrect implementation of argument reordering, arbitrary limit of output
size for sprintf and fprintf, willingness to access more bytes than "%.Ns"
specification allows, wrong formatting of LONGLONG_MIN, various field-padding
bugs and omissions. I believe it now accurately implements a subset of
the Single Unix Spec requirements (remaining unimplemented features are
documented, too). Bruce Momjian and Tom Lane.
Bruce Momjian [Sun, 4 Dec 2005 21:16:51 +0000 (21:16 +0000)]
Update:
< Win32 API, and we have to make sure MinGW handles it.
> Win32 API, and we have to make sure MinGW handles it. Another
> option is to wait for the MinGW project to fix it, or use the
> code from the LibGW32C project as a guide.
Bruce Momjian [Sun, 4 Dec 2005 04:33:18 +0000 (04:33 +0000)]
Add:
> o Add long file support for binary pg_dump output
>
> While Win32 supports 64-bit files, the MinGW API does not,
> meaning we have to build an fseeko replacement on top of the
> Win32 API, and we have to make sure MinGW handles it.
Tom Lane [Sat, 3 Dec 2005 21:06:18 +0000 (21:06 +0000)]
Treat procedural languages as owned by the bootstrap superuser, rather
than owned by nobody. This results in cleaner display of language ACLs,
since the backend's aclchk.c uses the same convention. AFAICS there is
no practical difference but it's nice to avoid emitting SET SESSION
AUTHORIZATION; also this will make it easier to transition pg_dump to
some future version in which we may include an explicit ownership column
in pg_language. Per gripe from David Begley.
Bruce Momjian [Sat, 3 Dec 2005 16:45:06 +0000 (16:45 +0000)]
Allow to_char(interval) and to_char(time) to use AM/PM specifications.
Map them to a single day, so '30 hours' is 'AM'.
Have to_char(interval) and to_char(time) use "HH", "HH12" as 12-hour
intervals, rather than bypass and print the full interval hours. This
is needed because to_char(time) is mapped to interval in this function.
Intervals should use "HH24"; document this suggestion.
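A minimal sketch of the mapping described above, not the backend's actual code: the hour count is wrapped onto a single day to pick AM/PM, then onto a 12-hour clock. The function name is hypothetical, and printing hours 0 and 12 as 12 is an assumed clock convention that the message does not spell out.

```python
from typing import Tuple


def meridiem_and_hh12(total_hours: int) -> Tuple[str, int]:
    """Map an hour count onto one 24-hour day, then onto a 12-hour clock."""
    hour_of_day = total_hours % 24      # '30 hours' wraps around to hour 6
    meridiem = "AM" if hour_of_day < 12 else "PM"
    # Assumption: hours 0 and 12 are shown as 12, as on a wall clock.
    hh12 = hour_of_day % 12 or 12
    return meridiem, hh12
```

Under these assumptions an interval of '30 hours' lands on hour 6 of the day and is labelled AM, matching the example in the message.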
Tom Lane [Sat, 3 Dec 2005 05:51:03 +0000 (05:51 +0000)]
Tweak indexscan machinery to avoid taking an AccessShareLock on an index
if we already have a stronger lock due to the index's table being the
update target table of the query. Same optimization I applied earlier
at the table level. There doesn't seem to be much interest in the more
radical idea of not locking indexes at all, so do what we can ...
Tom Lane [Fri, 2 Dec 2005 20:03:42 +0000 (20:03 +0000)]
Adjust scan plan nodes to avoid getting an extra AccessShareLock on a
relation if it's already been locked by execMain.c as either a result
relation or a FOR UPDATE/SHARE relation. This avoids an extra trip to
the shared lock manager state. Per my suggestion yesterday.
Bruce Momjian [Fri, 2 Dec 2005 04:28:19 +0000 (04:28 +0000)]
Add calculation of bitmap storage capacity.
< be cleared when a heap tuple is expired. Another idea is to maintain
< a bitmap of heap pages where all rows are visible to all backends,
< and allow index lookups to reference that bitmap to avoid heap
< lookups, perhaps the same bitmap we might add someday to determine
< which heap pages need vacuuming.
> be cleared when a heap tuple is expired.
>
> Another idea is to maintain a bitmap of heap pages where all rows
> are visible to all backends, and allow index lookups to reference
> that bitmap to avoid heap lookups, perhaps the same bitmap we might
> add someday to determine which heap pages need vacuuming. Frequently
> accessed bitmaps would have to be stored in shared memory. One 8k
> page of bitmaps could track 512MB of heap pages.
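The 512MB figure can be checked with a few lines of arithmetic. This sketch assumes one bit per heap page, 8k heap pages, and that the whole bitmap page holds bits (no page header); the names are illustrative only.

```python
PAGE_SIZE_BYTES = 8 * 1024   # an "8k" page
BITS_PER_BYTE = 8


def heap_bytes_tracked(bitmap_pages: int = 1,
                       page_size: int = PAGE_SIZE_BYTES) -> int:
    """Bytes of heap covered when each bit stands for one heap page."""
    bits = bitmap_pages * page_size * BITS_PER_BYTE   # 65536 bits per 8k page
    return bits * page_size                           # each bit covers one page


MEGABYTE = 1024 * 1024
```

With these assumptions, heap_bytes_tracked() // MEGABYTE comes to 512.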
Tom Lane [Fri, 2 Dec 2005 01:29:55 +0000 (01:29 +0000)]
Rearrange code in ExecInitBitmapHeapScan so that we don't initialize the
child plan nodes until we have acquired lock on the relation to scan.
The relative order of initialization of plan nodes isn't real important in
other cases, but it's critical here because one is supposed to lock a
relation before its indexes, not vice versa. The original coding was at
least vulnerable to deadlock against DROP INDEX, and perhaps worse things.
Bruce Momjian [Thu, 1 Dec 2005 22:30:43 +0000 (22:30 +0000)]
Add all heap page rows visible bitmap idea:
< the heap. One way to allow this is to set a bit to index tuples
> the heap. One way to allow this is to set a bit on index tuples
< be cleared when a heap tuple is expired.
<
> be cleared when a heap tuple is expired. Another idea is to maintain
> a bitmap of heap pages where all rows are visible to all backends,
> and allow index lookups to reference that bitmap to avoid heap
> lookups, perhaps the same bitmap we might add someday to determine
> which heap pages need vacuuming.
Bruce Momjian [Thu, 1 Dec 2005 22:07:59 +0000 (22:07 +0000)]
Split out MERGE and REPLACE/UPSERT items.
< * Add MERGE command that does UPDATE/DELETE, or on failure, INSERT (rules,
< triggers?)
> * Add SQL-standard MERGE command, typically used to merge two tables
>
> This is similar to UPDATE, then for unmatched rows, INSERT.
> Whether concurrent access allows modifications which could cause
> row loss is implementation independent.
>
> * Add REPLACE or UPSERT command that does UPDATE, or on failure, INSERT
Tom Lane [Thu, 1 Dec 2005 20:24:18 +0000 (20:24 +0000)]
Retry in FileRead and FileWrite if Windows returns ERROR_NO_SYSTEM_RESOURCES.
Also add a retry for Unixen returning EINTR, which hasn't been reported
as an issue but at least theoretically could be. Patch by Qingqing Zhou,
some minor adjustments by me.
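The retry idea can be sketched generically as follows. This is not the FileRead/FileWrite code: the function names, the caller-supplied predicate, and the bounded attempt count are all assumptions made for illustration. Recent Python versions already retry EINTR inside their own I/O calls, so EINTR serves here only as a convenient example of a transient error.

```python
import errno


def is_eintr(exc: OSError) -> bool:
    """Example predicate: treat an interrupted system call as transient."""
    return exc.errno == errno.EINTR


def read_with_retry(read_once, is_transient, max_attempts=5):
    """Call read_once() again while it fails with a transient OSError."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return read_once()
        except OSError as exc:
            # Give up on the last attempt or on a non-transient error.
            if attempt == max_attempts or not is_transient(exc):
                raise
```

A caller would pass a zero-argument callable that performs one read, plus a predicate such as is_eintr that decides which errors are worth retrying.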
Tom Lane [Wed, 30 Nov 2005 17:10:19 +0000 (17:10 +0000)]
Tweak choose_bitmap_and() heuristics in the light of example provided in bug
#2075: consider an index redundant if any of its index conditions were already
used, rather than if all of them were. Also, make the selectivity comparison
a bit fuzzy, so that very small differences in estimated selectivities don't
skew the results.
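The two heuristics above can be illustrated with a small sketch. It is not the planner's code: the function names, the representation of index conditions as strings in sets, and the 1% fuzz factor are assumptions chosen for the example.

```python
def is_redundant(index_conditions, used_conditions) -> bool:
    """An index is redundant if ANY of its conditions was already used."""
    return any(cond in used_conditions for cond in index_conditions)


def compare_selectivity(a: float, b: float, fuzz: float = 0.01) -> int:
    """Return -1, 0 or 1, treating values within the fuzz band as equal."""
    if a < b * (1.0 - fuzz):
        return -1          # a is clearly more selective
    if a > b * (1.0 + fuzz):
        return 1           # a is clearly less selective
    return 0               # too close to call; do not let it skew the choice
```

Here compare_selectivity(0.100, 0.1005) returns 0, so a difference that small would not change the ordering of candidate indexes.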
Michael Meskes [Wed, 30 Nov 2005 12:49:49 +0000 (12:49 +0000)]
- Made several variables "const char *" instead of "char *" as proposed by Qingqing Zhou <zhouqq@cs.toronto.edu>.
- Replaced all strdup() calls by ECPGstrdup().
- Set ecpg library version to 5.2.
- Set ecpg version to 4.2.1.
Bruce Momjian [Tue, 29 Nov 2005 02:02:40 +0000 (02:02 +0000)]
Update for 8.2:
< #A hyphen, "-", marks changes that will appear in the upcoming 8.1 release.#
> #A hyphen, "-", marks changes that will appear in the upcoming 8.2 release.#
Tom Lane [Tue, 29 Nov 2005 01:25:50 +0000 (01:25 +0000)]
Fix EXPLAIN and EXECUTE commands to pass portal parameters through to
the executor. This allows, for example, JDBC clients to use '?' bound
parameters in these commands. Per gripe from Virag Saksena.
Tom Lane [Mon, 28 Nov 2005 23:46:03 +0000 (23:46 +0000)]
Tweak hash join code to use an additional heuristic for deciding whether
it's worth probing the outer relation for emptiness before building the
hash table. To wit, if we're rescanning a join previously performed,
remember whether we found it nonempty the previous time, and don't bother
with the probe if it was nonempty. This buys back the performance lost
in examples like Mario Weilguni's.
Tom Lane [Mon, 28 Nov 2005 17:14:23 +0000 (17:14 +0000)]
Recent changes to allow hash join to exit early given empty input from
one child or the other had a problem: they did not leave the node in a
state that ExecReScanHashJoin would understand. In particular it would
tend to fail to reset the child plans when needed. Per report from
Mario Weilguni.
Tom Lane [Mon, 28 Nov 2005 04:35:32 +0000 (04:35 +0000)]
Change the parser to translate "foo [NOT] IN (expression-list)" to
ScalarArrayOpExpr when possible, that is, whenever there is an array type
for the values of the expression list. This completes the project I've
been working on to improve the speed of index searches with long IN lists,
as per discussion back in mid-October.
I did not force initdb, but until you do one you will see failures in the
"rules" regression test, because some of the standard system views use IN
and their compiled formats have changed.
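As a rough model of the translation, "x IN (list)" behaves like "x = ANY (array)" and "x NOT IN (list)" like "x <> ALL (array)". The sketch below mimics that with a use_or flag; the function names are hypothetical and SQL NULL semantics are deliberately ignored.

```python
import operator


def scalar_array_op(left, op, values, use_or: bool) -> bool:
    """Apply op(left, v) to every array element, combined with OR or AND."""
    results = (op(left, v) for v in values)
    return any(results) if use_or else all(results)


def in_list(x, values) -> bool:
    # x IN (...)  is modelled as  x = ANY (ARRAY[...])
    return scalar_array_op(x, operator.eq, values, use_or=True)


def not_in_list(x, values) -> bool:
    # x NOT IN (...)  is modelled as  x <> ALL (ARRAY[...])
    return scalar_array_op(x, operator.ne, values, use_or=False)
```

For example, in_list(2, [1, 2, 3]) is true while not_in_list(2, [1, 2, 3]) is false.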
Tom Lane [Sun, 27 Nov 2005 22:15:42 +0000 (22:15 +0000)]
Teach predtest.c how to reason about ScalarArrayOpExpr clauses as though
they were broken-out AND or OR lists. The least grotty way to do this
seemed to be to set up a general mechanism for handling nodes as though
they were ANDs or ORs. There's no other immediate use for it, but perhaps
we might want to use the mechanism someday for things like BETWEEN
SYMMETRIC.
Tom Lane [Sat, 26 Nov 2005 22:14:57 +0000 (22:14 +0000)]
Teach tid-scan code to make use of "ctid = ANY (array)" clauses, so that
"ctid IN (list)" will still work after we convert IN to ScalarArrayOpExpr.
Make some minor efficiency improvements while at it, such as ensuring that
multiple TIDs are fetched in physical heap order. And fix EXPLAIN so that
it shows what's really going on for a TID scan.
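Fetching TIDs in physical heap order amounts to sorting them by block number and then by offset. In this sketch TIDs are modelled as (block, offset) tuples; dropping duplicates is an extra assumption that the message does not mention.

```python
def tids_in_heap_order(tids):
    """Sort (block, offset) pairs so each heap block is visited once, in order."""
    # set() removes duplicate TIDs (an assumption of this sketch);
    # tuple comparison orders by block first, then by offset within the block.
    return sorted(set(tids))
```

Given [(5, 1), (0, 3), (5, 1), (0, 1)], this returns [(0, 1), (0, 3), (5, 1)].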
Tom Lane [Sat, 26 Nov 2005 03:03:07 +0000 (03:03 +0000)]
Change seqscan logic so that we check visibility of all tuples on a page
when we first read the page, rather than checking them one at a time.
This allows us to take and release the buffer content lock just once
per page, instead of once per tuple. Since it's a shared lock the
contention penalty for holding the lock longer shouldn't be too bad.
We can safely do this only when using an MVCC snapshot; else the
assumption that visibility won't change over time is uncool. Therefore
there are now two code paths depending on the snapshot type. I also
made the same change in nodeBitmapHeapscan.c, where it can be done always
because we only support MVCC snapshots for bitmap scans anyway.
Also make some incidental cleanups in the APIs of these functions.
Per a suggestion from Qingqing Zhou.
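The locking difference can be made concrete with a toy model. A page is just a list of tuples and the lock only counts how often it is taken; the class and function names are hypothetical, and the restriction to MVCC snapshots is not modelled.

```python
class CountingLock:
    """Stand-in for the buffer content lock that records acquisitions."""

    def __init__(self):
        self.acquisitions = 0

    def __enter__(self):
        self.acquisitions += 1
        return self

    def __exit__(self, exc_type, exc, tb):
        return False


def scan_tuple_at_a_time(page, is_visible, lock):
    """Old scheme: take and release the lock once per tuple."""
    visible = []
    for tup in page:
        with lock:
            if is_visible(tup):
                visible.append(tup)
    return visible


def scan_page_at_a_time(page, is_visible, lock):
    """New scheme: check every tuple on the page under a single lock hold."""
    with lock:
        return [tup for tup in page if is_visible(tup)]
```

Both functions return the same tuples; for a four-tuple page the first takes the lock four times and the second once.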
Tom Lane [Fri, 25 Nov 2005 19:47:50 +0000 (19:47 +0000)]
Teach planner and executor to handle ScalarArrayOpExpr as an indexable
qualification when the underlying operator is indexable and useOr is true.
That is, indexkey op ANY (ARRAY[...]) is effectively translated into an
OR combination of one indexscan for each array element. This only works
for bitmap index scans, of course, since regular indexscans no longer
support OR'ing of scans. There are still some loose ends to clean up
before changing 'x IN (list)' to translate as a ScalarArrayOpExpr;
for instance predtest.c ought to be taught about it. But this gets the
basic functionality in place.
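The OR combination described above can be pictured as one index probe per array element, with the results merged into a single bitmap. In this sketch the index is a plain dict from key to a set of (block, offset) TIDs, the operator is equality, and a Python set stands in for the bitmap; all of these are assumptions made for illustration.

```python
def bitmap_scan_any(index, values):
    """Emulate indexkey = ANY (ARRAY[...]) as an OR of per-element probes."""
    bitmap = set()
    for value in values:
        # One index scan per array element; OR the matches into the bitmap.
        bitmap |= index.get(value, set())
    # Read the bitmap out in physical order, as a bitmap heap scan would.
    return sorted(bitmap)
```

With index = {1: {(0, 2)}, 2: {(3, 1), (0, 1)}}, the call bitmap_scan_any(index, [2, 1, 7]) returns [(0, 1), (0, 2), (3, 1)].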
Tom Lane [Fri, 25 Nov 2005 04:24:48 +0000 (04:24 +0000)]
Improve ExecStoreTuple to be smarter about replacing the contents of
a TupleTableSlot: instead of calling ExecClearTuple, inline the needed
operations, so that we can avoid redundant steps. In particular, when
the old and new tuples are both on the same disk page, avoid releasing
and re-acquiring the buffer pin --- this saves work in both the bufmgr
and ResourceOwner modules. To make this improvement actually useful,
partially revert a change I made on 2004-04-21 that caused SeqNext
et al to call ExecClearTuple before ExecStoreTuple. The motivation
for that, to avoid grabbing the BufMgrLock separately for releasing
the old buffer and grabbing the new one, no longer applies. My
profiling says that this saves about 5% of the CPU time for an
all-in-memory seqscan.
Tom Lane [Wed, 23 Nov 2005 20:27:58 +0000 (20:27 +0000)]
Get rid of ExecAssignResultTypeFromOuterPlan() and make all plan node types
generate their output tuple descriptors from their target lists (ie, using
ExecAssignResultTypeFromTL()). We long ago fixed things so that all node
types have minimally valid tlists, so there's no longer any good reason to
have two different ways of doing it. This change is needed to fix bug
reported by Hayden James: the fix of 2005-11-03 to emit the correct column
names after optimizing away a SubqueryScan node didn't work if the new
top-level plan node used ExecAssignResultTypeFromOuterPlan to generate its
tupdesc, since the next plan node down won't have the correct column labels.
Tom Lane [Wed, 23 Nov 2005 17:21:04 +0000 (17:21 +0000)]
Fix problems with rewriter failing to set Query.hasSubLinks when inserting
a SubLink expression into a rule query. Pre-8.1 we essentially did this
unconditionally; 8.1 tries to do it only when needed, but was missing a
couple of cases. Per report from Kyle Bateman. Add some regression test
cases covering this area.