While playing around, I got the following error message:
--
FATAL: pre-existing shared memory block (key 5432001, ID 90898435) is
still in use
HINT: If you're sure there are no old server processes still running,
remove the shared memory block with the command "ipcrm", or just delete
the file "/home/hlinnaka/pgsql/data/postmaster.pid".
---
That's normal because I used "kill -9 postmaster" to shut down.
The hint advises me to use "ipcrm", but there's the "ipcclean" script in
bin for just this purpose. The hint should probably advise to use
ipcclean.
The attached patch replaces all occurrences of "ipcrm" with "ipcclean" in
src/backend/utils/init/miscinit.c and all the translations in
src/backend/po.
While reviewing the patch, I noticed a likely typo in hr.po. While I don't
speak Croatian, the translation seems to advise to use the "icpm(1)"
command. I changed that to "ipcclean" too.
Tom Lane [Mon, 6 Jun 2005 20:22:58 +0000 (20:22 +0000)]
Modify XLogInsert API to make callers specify whether pages to be backed
up have the standard layout with unused space between pd_lower and pd_upper.
When this is set, XLogInsert will omit the unused space without bothering
to scan it to see if it's zero. That saves time in XLogInsert, and also
allows reversion of my earlier patch to make PageRepairFragmentation et al
explicitly re-zero freed space. Per suggestion by Heikki Linnakangas.
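To illustrate what omitting the unused space buys, here is a minimal sketch in C. It is not XLogInsert's code: the function names, the SKETCH_BLCKSZ constant, and the flat byte-array page are assumptions made for this example. The caller supplies the hole boundaries, so nothing has to be scanned for zeroes.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_BLCKSZ 8192		/* assumed page size for this sketch */

/*
 * Copy a page into "out", skipping the bytes in [lower, upper).
 * Returns the number of bytes written, or -1 if the bounds are invalid.
 */
static int
page_pack(const uint8_t *page, uint16_t lower, uint16_t upper, uint8_t *out)
{
	if (lower > upper || upper > SKETCH_BLCKSZ)
		return -1;
	memcpy(out, page, lower);
	memcpy(out + lower, page + upper, (size_t) (SKETCH_BLCKSZ - upper));
	return (int) lower + (SKETCH_BLCKSZ - upper);
}

/*
 * Rebuild the full page from a packed copy, refilling the hole with zeroes.
 * The caller must pass the same bounds that were used for packing.
 */
static void
page_unpack(const uint8_t *in, uint16_t lower, uint16_t upper, uint8_t *page)
{
	memcpy(page, in, lower);
	memset(page + lower, 0, (size_t) (upper - lower));
	memcpy(page + upper, in + lower, (size_t) (SKETCH_BLCKSZ - upper));
}
```

The round trip only reproduces the original page if the hole really was zero when it was packed, or if nobody cares about its contents afterwards.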
Tom Lane [Mon, 6 Jun 2005 17:01:25 +0000 (17:01 +0000)]
Remove the mostly-stubbed-out-anyway support routines for WAL UNDO.
That code is never going to be used in the foreseeable future, and
where it's more than a stub it's making the redo routines harder to
read.
Tom Lane [Mon, 6 Jun 2005 04:13:36 +0000 (04:13 +0000)]
Nab some low-hanging fruit: replace the planner's base_rel_list and
other_rel_list with a single array indexed by rangetable index.
This reduces find_base_rel from O(N) to O(1) without any real penalty.
While find_base_rel isn't one of the major bottlenecks in any profile
I've seen so far, it was starting to creep up on the radar screen
for complex queries --- so might as well fix it.
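The idea of replacing a list search with an array indexed by rangetable index can be sketched as follows. This is an illustration only: RelInfo, RelArray, and the rel_array_* functions are hypothetical names, not the planner's actual structures.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the planner's per-relation information. */
typedef struct RelInfo
{
	int			relid;		/* rangetable index, starting at 1 */
} RelInfo;

typedef struct RelArray
{
	RelInfo   **rels;		/* slot i holds the entry for index i, or NULL */
	int			size;		/* number of allocated slots */
} RelArray;

/* Allocate one slot per rangetable entry; slot 0 stays unused. */
static int
rel_array_init(RelArray *arr, int rtable_length)
{
	arr->size = rtable_length + 1;
	arr->rels = calloc((size_t) arr->size, sizeof(RelInfo *));
	return arr->rels != NULL;
}

/* Register a relation under its own rangetable index. */
static void
rel_array_set(RelArray *arr, RelInfo *rel)
{
	if (rel->relid > 0 && rel->relid < arr->size)
		arr->rels[rel->relid] = rel;
}

/* O(1) lookup: a bounds check and one array access, no list walk. */
static RelInfo *
rel_array_find(const RelArray *arr, int relid)
{
	if (relid <= 0 || relid >= arr->size)
		return NULL;
	return arr->rels[relid];
}
```

The cost is one pointer per rangetable entry, including entries that never get a relation, which is cheap compared with walking a list on every lookup.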
Tom Lane [Sun, 5 Jun 2005 22:32:58 +0000 (22:32 +0000)]
Remove planner's private fields from Query struct, and put them into
a new PlannerInfo struct, which is passed around instead of the bare
Query in all the planning code. This commit is essentially just a
code-beautification exercise, but it does open the door to making
larger changes to the planner data structures without having to muck
with the widely-known Query struct.
Bruce Momjian [Sun, 5 Jun 2005 03:39:54 +0000 (03:39 +0000)]
Add description for backend termination:
< cleaned up properly. A new signal is needed for safe termination.
> cleaned up properly. A new signal is needed for safe termination
> because backends must first do a query cancel, then exit once they
> have run the query cancel cleanup routine.
Tom Lane [Sun, 5 Jun 2005 00:38:11 +0000 (00:38 +0000)]
Replace the parser's namespace tree (which formerly had the same
representation as the jointree) with two lists of RTEs, one showing
the RTEs accessible by qualified names, and the other showing the RTEs
accessible by unqualified names. I think this is conceptually simpler
than what we did before, and it's sure a whole lot easier to search.
This seems to eliminate the parse-time bottleneck for deeply nested
JOIN structures that was exhibited by phil@vodafone.
Bruce Momjian [Sun, 5 Jun 2005 00:28:36 +0000 (00:28 +0000)]
Add TODO.detail.
< logs
> logs [pitr]
130c130
< * Allow a warm standby system to also allow read-only queries
> * Allow a warm standby system to also allow read-only queries [pitr]
Tom Lane <tgl@sss.pgh.pa.us> writes:
> a_ogawa <a_ogawa@hi-ho.ne.jp> writes:
> > It is a reasonable idea. However, the majority part of MemSet was not
> > able to be avoided by this idea. Because the per-tuple contexts are used
> > at the early stage of executor.
>
> Drat. Well, what about changing that? We could introduce additional
> contexts or change the startup behavior so that the ones that are
> frequently reset don't have any data in them unless you are working
> with pass-by-ref values inside the inner loop.
That might be possible. However, I think that we should change only
aset.c for this issue.
I thought further: we can check whether a context has been used since the
last reset even when the blocks list is not empty. Please see attached patch.
Here's an updated version of the patch, with the following changes:
1) No longer uses "service name" as "application version". It's instead
hardcoded as "postgres". It could be argued that this part should be
backpatched to 8.0, but it doesn't make a big difference until you can
start changing it with GUC / connection parameters. This change only
affects kerberos 5, not 4.
2) Now downcases kerberos usernames when the client is running on win32.
3) Adds guc option for "krb_caseins_users" to make the server ignore
case mismatch which is required by some KDCs such as Active Directory.
Off by default, per discussion with Tom. This change only affects
kerberos 5, not 4.
4) Updated so it doesn't conflict with the rendezvous/bonjour patch
already in ;-)
Bruce Momjian [Sat, 4 Jun 2005 20:33:06 +0000 (20:33 +0000)]
At 2005-05-21 20:18:50 +0530, ams@oryx.com wrote:
>
> > The second issue is where plperl returns a large result set.
I have attached the following seven patches to address this problem:
1. Trivial. Replaces some errant spaces with tabs.
2. Trivial. Fixes the spelling of Jan's name, and gets rid of many
inane, useless, annoying, and often misleading comments. Here's
a sample: "plperl_init_all() - Initialize all".
(I have tried to add some useful comments here and there, and will
continue to do so now and again.)
3. Trivial. Splits up some long lines.
4. Converts SRFs in PL/Perl to use a Tuplestore and SFRM_Materialize
to return the result set, based on the PL/PgSQL model.
There are two major consequences: result sets will spill to disk when
they can no longer fit in work_mem; and "select foo_srf()" no longer
works. (I didn't lose sleep over the latter, since that form is not
valid in PL/PgSQL, and it's not documented in PL/Perl.)
5. Trivial, but important. Fixes use of "undef" instead of undef. This
would cause empty functions to fail in bizarre ways. I suspect that
there's still another (old) bug here. I'll investigate further.
6. Moves the majority of (4) out into a new plperl_return_next()
function, to make it possible to expose the functionality to
Perl; cleans up some of the code besides.
7. Add an spi_return_next function for use in Perl code.
If you want to apply the patches and try them out, 8-composite.diff is
what you should use. (Note: my patches depend upon Andrew's use-strict
and %_SHARED patches being applied.)
Here's something to try:
create or replace function foo() returns setof record as $$
    $i = 0;
    for ("World", "PostgreSQL", "PL/Perl") {
        spi_return_next({f1=>++$i, f2=>'Hello', f3=>$_});
    }
    return;
$$ language plperl;
select * from foo() as (f1 integer, f2 text, f3 text);
(Many thanks to Andrews Dunstan and Supernews for their help.)
Bruce Momjian [Sat, 4 Jun 2005 20:14:12 +0000 (20:14 +0000)]
Tom Lane <tgl@sss.pgh.pa.us> writes:
> a_ogawa <a_ogawa@hi-ho.ne.jp> writes:
> > It is a reasonable idea. However, the majority part of MemSet was not
> > able to be avoided by this idea. Because the per-tuple contexts are used
> > at the early stage of executor.
>
> Drat. Well, what about changing that? We could introduce additional
> contexts or change the startup behavior so that the ones that are
> frequently reset don't have any data in them unless you are working
> with pass-by-ref values inside the inner loop.
That might be possible. However, I think that we should change only
aset.c for this issue.
I thought further: we can check whether a context has been used since the
last reset even when the blocks list is not empty. Please see attached patch.
The effect of the patch that I measured is as follows:
o Execution time for running the SQL ten times.
(1) Linux (CPU: Pentium III, compiler option: -O2)
- original: 24.960s
- patched : 23.114s
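The patch itself is not reproduced here. One way such a "used since the last reset" check could look is sketched below in C; SketchContext, its fields, and the sketch_* functions are assumptions for illustration, not the aset.c code.

```c
#include <stdbool.h>
#include <string.h>

#define SKETCH_NFREELISTS 11	/* assumed number of freelists */

typedef struct SketchContext
{
	void	   *freelist[SKETCH_NFREELISTS];
	bool		is_reset;	/* true if nothing was allocated since last reset */
} SketchContext;

/* Counts how often the freelists were actually cleared. */
static int	clear_count = 0;

static void
sketch_init(SketchContext *cxt)
{
	memset(cxt->freelist, 0, sizeof(cxt->freelist));
	cxt->is_reset = true;
}

/* Stand-in for an allocation: touches a freelist and marks the context used. */
static void
sketch_mark_used(SketchContext *cxt, int slot, void *chunk)
{
	cxt->freelist[slot] = chunk;
	cxt->is_reset = false;
}

/* Reset the context, skipping the clearing work if it was never used. */
static void
sketch_reset(SketchContext *cxt)
{
	if (cxt->is_reset)
		return;
	memset(cxt->freelist, 0, sizeof(cxt->freelist));
	cxt->is_reset = true;
	clear_count++;
}
```

A per-tuple context that is reset on every row but rarely allocated from would take the early return most of the time, which is where a saving like the one measured above would presumably come from.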
Tom Lane [Sat, 4 Jun 2005 19:19:42 +0000 (19:19 +0000)]
Change expandRTE() and ResolveNew() back to taking just the single
RTE of interest, rather than the whole rangetable list. This makes
the API more understandable and avoids duplicate RTE lookups. This
patch reverts no-longer-needed portions of my patch of 2004-08-19.
Bruce Momjian [Sat, 4 Jun 2005 18:12:38 +0000 (18:12 +0000)]
Add:
> * Allow pg_ctl to work properly with configuration files located outside
> the PGDATA directory
>
> pg_ctl cannot read the pid file because it isn't located in the
> config directory but in the PGDATA directory. The solution is to
> allow pg_ctl to read and understand postgresql.conf to find the
> data_directory value.
>
Neil Conway [Sat, 4 Jun 2005 02:07:09 +0000 (02:07 +0000)]
Remove unused 'printCost' field from ExplainState, and simplify the code
accordingly (this field was always initialized to true). Patch from
Alvaro Herrera.
Tom Lane [Fri, 3 Jun 2005 23:05:30 +0000 (23:05 +0000)]
Revise handling of dropped columns in JOIN alias lists to avoid a
performance problem pointed out by phil@vodafone: to wit, we were
spending O(N^2) time to check dropped-ness in an N-deep join tree,
even in the case where the tree was freshly constructed and couldn't
possibly mention any dropped columns. Instead of recursing in
get_rte_attribute_is_dropped(), change the data structure definition:
the joinaliasvars list of a JOIN RTE must have a NULL Const instead
of a Var at any position that references a now-dropped column. This
costs nothing during normal parse-rewrite-plan path, and instead we
have a linear-time update to make when loading a stored rule that
might contain now-dropped columns. While at it, move the responsibility
for acquiring locks on relations referenced by rules into this separate
function (which I therefore chose to call AcquireRewriteLocks).
This saves effort --- namely, duplicated lock grabs in parser and rewriter
--- in the normal path at a cost of one extra non-locked heap_open()
in the stored-rule path; seems a good tradeoff. A fringe benefit is
that it is now *much* clearer that we acquire lock on relations referenced
in rules before we make any rewriter decisions based on their properties.
(I don't know of any bug of that ilk, but it wasn't exactly clear before.)
Tom Lane [Fri, 3 Jun 2005 19:00:12 +0000 (19:00 +0000)]
Just noticed that you can't Query-Cancel a long planner run, because
no part of the planner did CHECK_FOR_INTERRUPTS(). Add one in a
suitably strategic spot.
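The general pattern of polling for a pending cancel inside a long-running loop can be sketched as follows. This is a generic C illustration, not the CHECK_FOR_INTERRUPTS() macro: the flag name, run_search(), and the simulated cancel point are all assumptions.

```c
#include <signal.h>

/* Set asynchronously (for example by a signal handler) to request a cancel. */
static volatile sig_atomic_t cancel_requested = 0;

/*
 * Run up to "steps" units of work, checking the cancel flag before each one.
 * "cancel_at" simulates a cancel request arriving at that step; pass -1 for
 * no cancel. Returns the number of steps completed.
 */
static long
run_search(long steps, long cancel_at)
{
	long		done = 0;

	for (long i = 0; i < steps; i++)
	{
		if (i == cancel_at)
			cancel_requested = 1;	/* stand-in for the signal handler */

		if (cancel_requested)
			break;				/* the strategic spot: bail out promptly */

		done++;					/* one unit of work */
	}
	return done;
}
```

Without the check inside the loop, the request would only be noticed after all the steps had run.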
Tom Lane [Thu, 2 Jun 2005 21:03:25 +0000 (21:03 +0000)]
Push enable/disable of notify and catchup interrupts all the way down
to just around the bare recv() call that gets a command from the client.
The former placement in PostgresMain was unsafe because the intermediate
processing layers (especially SSL) use facilities such as malloc that are
not necessarily re-entrant. Per report from counterstorm.com.
Michael Meskes [Thu, 2 Jun 2005 12:35:11 +0000 (12:35 +0000)]
- Fixed memory leak in ecpglib by adding some missing free() commands.
- Added patch by Gavin Scott <gavin@planetacetech.com> for Intel 64bit hardware.
Tom Lane [Thu, 2 Jun 2005 05:55:29 +0000 (05:55 +0000)]
Change CRCs in WAL records from 64bit to 32bit for performance reasons.
Instead of a separate CRC on each backup block, include backup blocks
in their parent WAL record's CRC; this is important to ensure that the
backup block really goes with the WAL record, ie there was not a page
tear right at the start of the backup block. Implement a simple form
of compression of backup blocks: drop any run of zeroes starting at
pd_lower, so as not to store the unused 'hole' that commonly exists in
PG heap and index pages. Tweak PageRepairFragmentation and related
routines to ensure they keep the unused space zeroed, so that the above
compression method remains effective. All per recent discussions.
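Why one CRC over the record plus its backup block ties the two together can be shown with a small sketch. It uses the common reflected CRC-32 (polynomial 0xEDB88320) as an assumption, which is not necessarily the exact CRC that the WAL code computes, and record_crc() is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Feed "len" bytes into a running CRC-32 (bitwise, reflected form). */
static uint32_t
crc32_update(uint32_t crc, const void *data, size_t len)
{
	const uint8_t *p = data;

	while (len--)
	{
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	}
	return crc;
}

/*
 * Compute a single CRC that covers the record body followed by its
 * backup block, instead of a separate CRC for each part.
 */
static uint32_t
record_crc(const void *rec, size_t rec_len, const void *blk, size_t blk_len)
{
	uint32_t	crc = 0xFFFFFFFFu;

	crc = crc32_update(crc, rec, rec_len);
	crc = crc32_update(crc, blk, blk_len);
	return crc ^ 0xFFFFFFFFu;
}
```

Because the block bytes are folded into the same running value, a record paired with the wrong or a torn backup block fails the check, which a standalone per-block CRC could not detect.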
Tom Lane [Wed, 1 Jun 2005 17:05:11 +0000 (17:05 +0000)]
patternsel() was improperly stripping RelabelType from the derived
expressions it constructed, causing scalarineqsel to become confused
if the underlying variable was of a domain type. Per report from
Kevin Grittner.
Tom Lane [Tue, 31 May 2005 19:10:28 +0000 (19:10 +0000)]
Add test to WAL replay to verify that xl_prev points back to the previous
WAL record; this is necessary to be sure we recognize stale WAL records
when a WAL page was only partially written during a system crash.
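The back-pointer test can be illustrated with a simplified replay loop in C. The Record struct, LogPos, and count_replayable() are hypothetical, and real WAL records carry far more than these two fields.

```c
#include <stdint.h>

typedef uint64_t LogPos;

/* Simplified record: its own position and the position of its predecessor. */
typedef struct Record
{
	LogPos		pos;
	LogPos		prev;
} Record;

/*
 * Walk the records in order and stop at the first one whose back-pointer
 * does not match the record read just before it. Returns how many records
 * are safe to replay. "start_prev" is the position preceding the first one.
 */
static int
count_replayable(const Record *recs, int nrecs, LogPos start_prev)
{
	LogPos		expected_prev = start_prev;
	int			i;

	for (i = 0; i < nrecs; i++)
	{
		if (recs[i].prev != expected_prev)
			break;				/* stale data left from an earlier write */
		expected_prev = recs[i].pos;
	}
	return i;
}
```

A stale record can look valid on its own, but its back-pointer refers to whatever preceded it in its own generation, so the chain breaks at that point.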
Tom Lane [Tue, 31 May 2005 03:03:59 +0000 (03:03 +0000)]
Teach ruleutils to drill down into RECORD-type Vars in the same way
that the parser now can, so that it can reverse-list cases involving
FieldSelect from a RECORD Var.
Tom Lane [Tue, 31 May 2005 01:03:23 +0000 (01:03 +0000)]
ParseComplexProjection should make use of expandRecordVariable so that
it can handle cases like (foo.x).y where foo is a subquery and x is
a function-returning-RECORD RTE in that subquery.
Tom Lane [Mon, 30 May 2005 23:09:07 +0000 (23:09 +0000)]
Document get_call_result_type() and friends; mark TypeGetTupleDesc()
and RelationNameGetTupleDesc() as deprecated; remove uses of the
latter in the contrib library. Along the way, clean up crosstab()
code and documentation a little.
Bruce Momjian [Mon, 30 May 2005 21:12:23 +0000 (21:12 +0000)]
Move to ALTER section:
< * Prevent child tables from altering constraints like CHECK that were
< inherited from the parent table
470a469,471
>
> o Prevent child tables from altering constraints like CHECK that were
> inherited from the parent table
Tom Lane [Mon, 30 May 2005 18:55:49 +0000 (18:55 +0000)]
Add support for FUNCTION RTEs to build_physical_tlist(), so that the
physical-tlist optimization can be applied to FunctionScan nodes as well
as regular tables and SubqueryScans.
Bruce Momjian [Mon, 30 May 2005 14:50:35 +0000 (14:50 +0000)]
Make psql's backslash escapes for bytes in variable strings follow the
backend convention of allowing only octal, like \045. Remove support for
\decimal, \0octal, and \0xhex, which matched what strtol() accepts but
didn't make sense with backslashes.
These now return the same character:
test=> \set x '\54'
test=> \echo :x
,
test=> \set x '\054'
test=> \echo :x
,
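An octal-only rule of this kind can be sketched in C as below. This is an illustration of the rule, not psql's actual parsing code: decode_octal_escapes() is a hypothetical helper, and it leaves every other backslash sequence untouched rather than modelling psql's remaining escapes.

```c
#include <stddef.h>

/*
 * Decode backslash escapes of one to three octal digits in "src" into
 * "dst". All other characters are copied unchanged. "dst" needs room for
 * strlen(src) + 1 bytes. Returns the decoded length, excluding the
 * terminating NUL.
 */
static size_t
decode_octal_escapes(const char *src, char *dst)
{
	size_t		n = 0;

	while (*src)
	{
		if (src[0] == '\\' && src[1] >= '0' && src[1] <= '7')
		{
			unsigned int value = 0;
			int			digits = 0;

			src++;				/* skip the backslash */
			while (digits < 3 && *src >= '0' && *src <= '7')
			{
				value = value * 8 + (unsigned int) (*src - '0');
				src++;
				digits++;
			}
			dst[n++] = (char) (value & 0xFF);	/* keep the low byte only */
		}
		else
			dst[n++] = *src++;
	}
	dst[n] = '\0';
	return n;
}
```

With this rule the leading zero carries no special meaning, so \54 and \054 both decode to octal 54, the comma.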
Neil Conway [Mon, 30 May 2005 07:20:59 +0000 (07:20 +0000)]
When enqueueing after-row triggers for updates of a table with a foreign
key, compare the new and old row versions. If the foreign key column has
not changed, we needn't enqueue the trigger, since the update cannot
violate the foreign key. This optimization was previously applied in the
RI trigger function, but it is more efficient to avoid firing the trigger
altogether. Per recent discussion on pgsql-hackers.
Also add a regression test for some unintuitive foreign key behavior, and
refactor some code that deals with the OIDs of the various RI trigger
functions.
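The skip-if-unchanged test can be sketched in C like this. Row, ForeignKey, and the function names are hypothetical, rows are reduced to integer columns, and NULL handling and other real-world details are left out.

```c
#include <stdbool.h>

#define SKETCH_MAX_FK_COLS 4

/* Simplified row: a fixed number of integer columns. */
typedef struct Row
{
	int			cols[8];
} Row;

/* Which columns of the row make up the foreign key. */
typedef struct ForeignKey
{
	int			ncols;
	int			colidx[SKETCH_MAX_FK_COLS];
} ForeignKey;

/* Counts how many checks were queued. */
static int	queued_checks = 0;

/* True if any foreign key column differs between the two row versions. */
static bool
fk_columns_changed(const ForeignKey *fk, const Row *oldrow, const Row *newrow)
{
	for (int i = 0; i < fk->ncols; i++)
	{
		int			c = fk->colidx[i];

		if (oldrow->cols[c] != newrow->cols[c])
			return true;
	}
	return false;
}

/* Queue the after-row check only when the key actually changed. */
static void
enqueue_fk_check_on_update(const ForeignKey *fk,
						   const Row *oldrow, const Row *newrow)
{
	if (!fk_columns_changed(fk, oldrow, newrow))
		return;					/* unchanged key cannot violate the FK */
	queued_checks++;
}
```

Doing the comparison at enqueue time avoids the cost of queueing and later firing the trigger at all, not just the work inside the trigger function.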
Neil Conway [Mon, 30 May 2005 06:52:38 +0000 (06:52 +0000)]
Create separate ON INSERT and ON UPDATE triggers on tables with foreign
keys, rather than a single trigger for both events. This should not change
functionality, but it is more consistent: previously, there were trigger
functions for both "check_insert" and "check_update", but the former was
used for both events.
Bump catalog version number (not strictly necessary, but best to be
cautious).
Tom Lane [Mon, 30 May 2005 01:20:50 +0000 (01:20 +0000)]
Change the UNKNOWN type to have an internal representation matching
cstring, rather than text, so as to eliminate useless conversions
inside the parser. Per recent discussion.
Tom Lane [Mon, 30 May 2005 01:04:44 +0000 (01:04 +0000)]
Skip eval_const_expressions when the query is such that the expression
would be evaluated only once anyway (ie, it's just a SELECT with no
FROM or an INSERT ... VALUES). The planner can't do it any faster than
the executor, so no point in an extra copying of the expression tree.
Tom Lane [Sun, 29 May 2005 23:38:05 +0000 (23:38 +0000)]
Avoid unnecessary fetch from pg_shadow in the normal case in
pg_class_aclmask(). We only need to do this when we have to check
pg_shadow.usecatupd, and that's not relevant unless the target table
is a system catalog. So we can usually avoid one syscache lookup.
Tom Lane [Sun, 29 May 2005 22:45:02 +0000 (22:45 +0000)]
Improve LockAcquire API per my recent proposal. All error conditions
are now reported via elog, eliminating the need to test the result code
at most call sites. Make it possible for the caller to distinguish a
freshly acquired lock from one already held in the current transaction.
Use that capability to avoid redundant AcceptInvalidationMessages() calls
in LockRelation().
Tom Lane [Sun, 29 May 2005 20:38:06 +0000 (20:38 +0000)]
Make superuser.c maintain a simple one-entry cache holding the superuser
status of the most recently queried userid. Since the common pattern is
many successive queries about the same user (ie, the current user) this
can save a lot of syscache probes.
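A one-entry cache of this shape can be sketched in C as follows. It is illustrative only: the names are hypothetical, slow_is_superuser() stands in for the syscache probe, and the rule that user 10 is a superuser is made up for the example.

```c
#include <stdbool.h>

typedef unsigned int UserId;

/* Counts calls to the slow path so the effect of the cache is visible. */
static int	slow_lookup_calls = 0;

/* Hypothetical expensive lookup. */
static bool
slow_is_superuser(UserId userid)
{
	slow_lookup_calls++;
	return userid == 10;		/* assumption: only user 10 is a superuser */
}

/* The single cache entry. */
static bool cache_valid = false;
static UserId cached_userid;
static bool cached_result;

/* Answer from the cache when the same user is asked about again. */
static bool
is_superuser_cached(UserId userid)
{
	if (cache_valid && cached_userid == userid)
		return cached_result;

	cached_result = slow_is_superuser(userid);
	cached_userid = userid;
	cache_valid = true;
	return cached_result;
}
```

A real implementation would also need some way to invalidate the entry when the underlying user data changes; that part is not shown here.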
Tom Lane [Sun, 29 May 2005 18:24:14 +0000 (18:24 +0000)]
Remove typeidIsValid() checks in can_coerce_type(). These checks
were pretty expensive and I believe the case they were put in to
defend against can no longer arise, now that we have dependency checks
to prevent deletion of a type entry that is still referenced. Certainly
the example given in the CVS log entry can't happen anymore.
Since this was the only use of typeidIsValid(), remove the routine too.
Tom Lane [Sun, 29 May 2005 17:10:23 +0000 (17:10 +0000)]
expandRTE and get_rte_attribute_type mistakenly always imputed typmod -1
to columns of an RTE that was a function returning RECORD with a column
definition list. Apparently no one has tried to use non-default typmod
with a function returning RECORD before.
Tom Lane [Sun, 29 May 2005 04:23:07 +0000 (04:23 +0000)]
Modify hash_search() API to prevent future occurrences of the error
spotted by Qingqing Zhou. The HASH_ENTER action now automatically
fails with elog(ERROR) on out-of-memory --- which incidentally lets
us eliminate duplicate error checks in quite a bunch of places. If
you really need the old return-NULL-on-out-of-memory behavior, you
can ask for HASH_ENTER_NULL. But there is now an Assert in that path
checking that you aren't hoping to get that behavior in a palloc-based
hash table.
Along the way, remove the old HASH_FIND_SAVE/HASH_REMOVE_SAVED actions,
which were not being used anywhere anymore, and were surely too ugly
and unsafe to want to see revived again.
Tom Lane [Sat, 28 May 2005 17:21:32 +0000 (17:21 +0000)]
Bgwriter should PANIC if it runs out of memory for pending-fsyncs
hash table. This is a pretty unlikely scenario, since the table
should be tiny, but we can't guarantee continued correct operation
if it does occur. Spotted by Qingqing Zhou.
Tom Lane [Sat, 28 May 2005 05:10:47 +0000 (05:10 +0000)]
get_expr_result_type has to be prepared to pull type information
from a RECORD Const node, because that's what it may be faced with
after constant-folding of a function returning RECORD. Per example
from Michael Fuhr.
Bruce Momjian [Sat, 28 May 2005 04:12:13 +0000 (04:12 +0000)]
Remove:
<
< * Add XML output to pg_dump and COPY
<
< We already allow XML to be stored in the database, and XPath queries
< can be used on that data using /contrib/xml2. It also supports XSLT
< transformations.
Tom Lane [Fri, 27 May 2005 23:31:21 +0000 (23:31 +0000)]
Arrange to cache fmgr lookup information for an index's access method
routines in the index's relcache entry, instead of doing a fresh fmgr_info
on every index access. We were already doing this for the index's opclass
support functions; not sure why we didn't think to do it for the AM
functions too. This supersedes the former method of caching (only)
amgettuple in indexscan scan descriptors; it's an improvement because the
function lookup can be amortized across multiple statements instead of
being repeated for each statement. Even though lookup for builtin
functions is pretty cheap, this seems to drop a percent or two off some
simple benchmarks.
Bruce Momjian [Fri, 27 May 2005 22:07:26 +0000 (22:07 +0000)]
Add:
> * Consider sorting hash buckets so entries can be found using a binary
> search, rather than a linear scan
> * In hash indexes, consider storing the hash value with or instead
> of the key itself
Bruce Momjian [Fri, 27 May 2005 22:01:18 +0000 (22:01 +0000)]
Add:
> * Add the features of packages
> o Make private objects accessible only to objects in the same schema
> o Allow current_schema.objname to access current schema objects
> o Add session variables
> o Allow nested schemas
Bruce Momjian [Fri, 27 May 2005 21:31:23 +0000 (21:31 +0000)]
Display only 9 subsecond digits instead of 10 for time values, for
consistency and to prevent rounding for days < 30. Also round off all
trailing zeros, rather than leaving an even number of digits.
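Removing all trailing zeros from a fractional value can be sketched in C as below. This is a generic illustration and not the backend's time-formatting code: format_seconds() and its parameters are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/*
 * Print "sec" with at most "precision" fractional digits, then strip all
 * trailing zeros, and the decimal point too if nothing follows it.
 */
static void
format_seconds(double sec, int precision, char *buf, size_t buflen)
{
	size_t		len;

	snprintf(buf, buflen, "%.*f", precision, sec);

	/* Only trim when there is a fractional part, so "10" stays "10". */
	if (strchr(buf, '.') == NULL)
		return;

	len = strlen(buf);
	while (len > 0 && buf[len - 1] == '0')
		len--;
	if (len > 0 && buf[len - 1] == '.')
		len--;
	buf[len] = '\0';
}
```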