Robert Haas [Mon, 24 Jan 2011 01:44:48 +0000 (20:44 -0500)]
sepgsql, an SE-Linux integration for PostgreSQL
This is still pretty rough - among other things, the documentation
needs work, and the messages need a visit from the style police -
but this gets the basic framework in place.
Magnus Hagander [Sun, 23 Jan 2011 22:39:18 +0000 (23:39 +0100)]
Make walsender options order-independent
While doing this, also move base backup options into
a struct instead of increasing the number of parameters
to multiple functions for each new option.
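
A minimal sketch of the idea in this commit, written in Python rather than the backend's C. The option names (LABEL, PROGRESS, FAST), the `BaseBackupOptions` class, and `parse_options` are illustrative assumptions, not taken from the patch. The point is only that options land in one structure and may arrive in any order.

```python
from dataclasses import dataclass


@dataclass
class BaseBackupOptions:
    """Hypothetical option bundle; field names are illustrative only."""
    label: str = "base backup"
    progress: bool = False
    fast_checkpoint: bool = False


def parse_options(tokens):
    """Accept options in any order; reject duplicates and unknown words."""
    opts = BaseBackupOptions()
    seen = set()
    it = iter(tokens)
    for tok in it:
        key = tok.upper()
        if key in seen:
            raise ValueError(f"duplicate option: {key}")
        seen.add(key)
        if key == "LABEL":
            try:
                opts.label = next(it)  # LABEL consumes the following token
            except StopIteration:
                raise ValueError("LABEL requires a value") from None
        elif key == "PROGRESS":
            opts.progress = True
        elif key == "FAST":
            opts.fast_checkpoint = True
        else:
            raise ValueError(f"unknown option: {tok}")
    return opts
```

Adding a new option then means adding one field and one branch, instead of widening the parameter list of every function that the options pass through.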
Add 'directory' format to pg_dump. The new directory format is compatible
with the 'tar' format, in that untarring a tar format archive produces a
valid directory format archive.
Tom Lane [Sun, 23 Jan 2011 19:26:51 +0000 (14:26 -0500)]
Fix another portability issue in pg_basebackup.
The target of sscanf with a %o format had better be of integer width,
but "mode_t" conceivably isn't that. Another compiler warning seen
only on some platforms; this one I think is potentially a real bug
and not just a warning.
Tom Lane [Sun, 23 Jan 2011 19:13:46 +0000 (14:13 -0500)]
Improve getObjectDescription's display of pg_amop and pg_amproc entries.
Include the lefttype/righttype columns explicitly (instead of assuming
the reader can deduce them from the operator or function description),
and move the operator or function description to the end of the string,
to make it clearer that it's a referenced object and not the amop or
amproc item itself. Per extensive discussion of Andreas Karlsson's
original patch.
Tom Lane [Sun, 23 Jan 2011 18:12:55 +0000 (13:12 -0500)]
Revert "Factor out functions responsible for caching I/O routines".
This reverts commit 740e54ca84c437fd67524f97a3ea9ddea752e208, which seems
to have tickled an optimization bug in gcc 4.5.x, as reported upstream at
https://bugzilla.redhat.com/show_bug.cgi?id=671899
Since this patch had no purpose beyond code beautification, it's not
worth expending a lot of effort to look for another workaround.
Magnus Hagander [Sun, 23 Jan 2011 11:21:23 +0000 (12:21 +0100)]
Add pg_basebackup tool for streaming base backups
This tool makes it possible to do the pg_start_backup/
copy files/pg_stop_backup step in a single command.
There are still some steps to be done before this is a
complete backup solution, such as the ability to stream
the required WAL logs, but it's still usable, and
could do with some buildfarm coverage.
In passing, make the checkpoint request optionally
fast instead of hardcoding it.
Magnus Hagander, reviewed by Fujii Masao and Dimitri Fontaine
Robert Haas [Sun, 23 Jan 2011 01:51:32 +0000 (20:51 -0500)]
Code cleanup for assign_transaction_read_only.
As in commit fb4c5d2798730f60b102d775f22fb53c26a6445d on 2011-01-21,
this avoids spurious debug messages and allows idempotent changes at
any time. Along the way, make assign_XactIsoLevel allow idempotent
changes even when not within a subtransaction, to be consistent with
the new coding of assign_transaction_read_only and because there's
no compelling reason to do otherwise.
Tom Lane [Sun, 23 Jan 2011 01:43:54 +0000 (20:43 -0500)]
Quick hack to un-break plpython regression tests.
It's not clear to me what should happen to the other plpython_unicode
variant expected files, but this patch gets things passing on my own
machines and at least some of the buildfarm.
Tom Lane [Sun, 23 Jan 2011 01:31:24 +0000 (20:31 -0500)]
Allow the wal_buffers setting to be auto-tuned to a reasonable value.
If wal_buffers is initially set to -1 (which is now the default), it's
replaced by 1/32nd of shared_buffers, with a minimum of 8 (the old default)
and a maximum of the XLOG segment size. The allowed range for manual
settings is still from 4 up to whatever will fit in shared memory.
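
The auto-tuning rule above can be sketched as follows. This is an illustration in Python, not the backend code; it assumes all values are counted in WAL pages, an 8 kB page, and a 16 MB XLOG segment, none of which the commit message states. The function name is hypothetical.

```python
BLOCK_SIZE = 8192                  # assumed WAL page size in bytes
SEGMENT_SIZE = 16 * 1024 * 1024    # assumed XLOG segment size in bytes


def effective_wal_buffers(wal_buffers, shared_buffers):
    """Return the number of WAL pages to use; inputs are counted in pages."""
    if wal_buffers != -1:
        return wal_buffers                          # manual setting is kept as given
    auto = shared_buffers // 32                     # 1/32nd of shared_buffers
    auto = min(auto, SEGMENT_SIZE // BLOCK_SIZE)    # cap at one XLOG segment
    auto = max(auto, 8)                             # never below the old default
    return auto
```

The sketch does not validate manual settings against the allowed range.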
Tom Lane [Sat, 22 Jan 2011 23:01:31 +0000 (18:01 -0500)]
Suppress "control reaches end of non-void function" warning from gcc 4.5.
Not sure why I'm seeing this on Fedora 14 and not earlier versions.
Seems like a regression that gcc no longer knows that DIE() doesn't return.
Still, adding a dummy return is harmless enough.
Tom Lane [Sat, 22 Jan 2011 22:56:42 +0000 (17:56 -0500)]
Suppress possibly-uninitialized-variable warnings from gcc 4.5.
It appears that gcc 4.5 can issue such warnings for whole structs, not
just scalar variables as in the past. Refactor some pg_dump code slightly
so that the OutputContext local variables are always initialized, even
if they won't be used. It's cheap enough to not be worth worrying about.
Peter Eisentraut [Sat, 22 Jan 2011 20:08:51 +0000 (22:08 +0200)]
Get rid of the global variable holding the error state
Global error handling led to confusion and was hard to manage. With
this change, errors from PostgreSQL are immediately reported to Python
as exceptions. This requires setting a Python exception after
reporting the caught PostgreSQL error as a warning, because PLy_elog
destroys the Python exception state.
Ideally, all places where PostgreSQL errors need to be reported back
to Python should be wrapped in subtransactions, to make going back to
Python from a longjmp safe. This will be handled in a separate patch.
Tom Lane [Sat, 22 Jan 2011 20:01:26 +0000 (15:01 -0500)]
More pg_test_fsync fixups.
Reduce #includes to minimum actually needed; in particular include
postgres_fe.h not postgres.h, so as to stop build failures on some
platforms.
Use get_progname() instead of hardwired program name; improve error
checking for command line syntax; bring error messages into line with
style guidelines; include strerror result in die() cases.
Robert Haas [Sat, 22 Jan 2011 02:49:19 +0000 (21:49 -0500)]
Code cleanup for assign_XactIsoLevel.
The new coding avoids a spurious debug message when a transaction
that has changed the isolation level has been rolled back. It also
allows the property to be freely changed to the current value within
a subtransaction.
Tom Lane [Sat, 22 Jan 2011 00:44:53 +0000 (19:44 -0500)]
More pg_test_fsync cleanup.
Un-break Windows build (I hope) by making the HAVE_FSYNC_WRITETHROUGH
code match the backend. Fix incorrect program help message. static-ize
all functions.
Tom Lane [Sat, 22 Jan 2011 00:27:25 +0000 (19:27 -0500)]
Clean up pg_test_fsync commit.
Actually rename the program, rather than just claiming we did. Hook it
into the build system. Get rid of useless dependency on libpq. Clean up
#include list and messy whitespace.
Peter Eisentraut [Fri, 21 Jan 2011 21:46:56 +0000 (23:46 +0200)]
Correctly add exceptions to the plpy module for Python 3
The way the exception types were added to the module was wrong for
Python 3. Exception classes were not actually available from plpy.
Fix that by factoring out code that is responsible for defining new
Python exceptions and make it work with Python 3. New regression test
makes sure the plpy module has the expected contents.
Don't require usage privileges on the foreign data wrapper when creating a
foreign table. We check for usage privileges on the foreign server; that ought
to be enough.
Peter Eisentraut [Tue, 18 Jan 2011 21:39:09 +0000 (23:39 +0200)]
Skip dropped attributes when converting Python objects to tuples
Pay attention to the attisdropped field and skip over TupleDesc fields
that have it set. Not a real problem until we get table returning
functions, but it's the right thing to do anyway.
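
A rough sketch of the conversion loop, in Python for brevity. `Attribute` and `mapping_to_row` are hypothetical stand-ins for the tuple descriptor and the conversion routine, and filling a dropped slot with a null placeholder is an assumption about how "skipping" is realized.

```python
from collections import namedtuple

# Hypothetical stand-in for one entry of a tuple descriptor.
Attribute = namedtuple("Attribute", ["name", "attisdropped"])


def mapping_to_row(desc, obj):
    """Build a row from a mapping, never looking up dropped columns."""
    row = []
    for att in desc:
        if att.attisdropped:
            row.append(None)    # keep the slot so positions still line up
            continue
        row.append(obj[att.name])
    return row
```

The returned list keeps one slot per descriptor entry, so column positions are unchanged even though the dropped column is never read from the mapping.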
Peter Eisentraut [Tue, 18 Jan 2011 21:22:37 +0000 (23:22 +0200)]
Fix an error when a set-returning function fails halfway through the execution
If the function using yield to return rows fails halfway, the iterator
stays open and subsequent calls to the function will resume reading
from it. The fix is to unref the iterator and set it to NULL if there
has been an error.
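
The failure mode and the fix can be illustrated in plain Python. `SetReturningCall` and `Flaky` are hypothetical names for this sketch; the real fix lives in the C code of PL/Python and unrefs the iterator object.

```python
class SetReturningCall:
    """Hypothetical per-function state that survives between row fetches."""

    def __init__(self, func):
        self.func = func
        self.iterator = None

    def next_row(self):
        if self.iterator is None:
            self.iterator = iter(self.func())    # start a fresh result set
        try:
            return next(self.iterator)
        except Exception:
            # On an error (or normal exhaustion) drop the iterator, so the
            # next call starts over instead of resuming a half-read one.
            self.iterator = None
            raise


class Flaky:
    """Iterator that fails on its second row but could keep going afterwards."""

    def __init__(self):
        self.n = 0

    def __iter__(self):
        return self

    def __next__(self):
        self.n += 1
        if self.n == 2:
            raise RuntimeError("boom")
        return self.n
```

Without the reset in the `except` branch, a call made after the failure would continue with the third row of the old result set.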
Tom Lane [Tue, 18 Jan 2011 19:09:22 +0000 (14:09 -0500)]
Avoid detoast in texteq/textne/byteaeq/byteane for unequal-length strings.
We can get the length of a compressed or out-of-line datum without actually
detoasting it. If the lengths of two strings are unequal, we can then
conclude they are unequal without detoasting. That saves considerable work
in an admittedly less-common case, without costing anything much when the
optimization doesn't apply.
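
A small model of the shortcut, in Python. `StoredText` is a hypothetical stand-in for a compressed datum whose uncompressed length is kept in its header; zlib merely plays the role of the compression step here. The counter is only there to show that the unequal-length case never decompresses.

```python
import zlib


class StoredText:
    """Hypothetical compressed value that remembers its raw length."""

    detoast_calls = 0    # counts how often the expensive path is taken

    def __init__(self, text):
        raw = text.encode("utf-8")
        self.raw_length = len(raw)           # readable without decompressing
        self._packed = zlib.compress(raw)

    def detoast(self):
        StoredText.detoast_calls += 1
        return zlib.decompress(self._packed)


def text_eq(a, b):
    """Bytewise equality that checks the cheap length header first."""
    if a.raw_length != b.raw_length:
        return False                         # unequal lengths: no detoast needed
    return a.detoast() == b.detoast()
```

When the lengths match, the cost is one extra integer comparison on top of the usual work.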
Peter Eisentraut [Mon, 17 Jan 2011 19:46:36 +0000 (21:46 +0200)]
Use HTABs instead of Python dictionary objects to cache procedures
Two separate hash tables are used for regular procedures and for
trigger procedures, since the way trigger procedures work is quite
different from normal stored procedures. Change the signatures of
PLy_procedure_{get,create} to accept the function OID and a Boolean
flag indicating whether it's a trigger. This should make implementing
a PL/Python validator easier.
Using HTABs instead of Python dictionaries makes error recovery
easier, and allows for procedures to be cached based on their OIDs,
not their names. It also allows getting rid of the PyCObject field
that used to hold a pointer to PLyProcedure, since PyCObjects are
deprecated in Python 2.7 and replaced by Capsules in Python 3.
Tom Lane [Mon, 17 Jan 2011 17:38:52 +0000 (12:38 -0500)]
Fix miscalculation of itemsafter in array_set_slice().
If the slice to be assigned to was before the existing array lower bound
(requiring at least one null element to spring into existence to fill the
gap), the code miscalculated how many entries needed to be copied from
the old array's null bitmap. This could result in trashing the array's
data area (as seen in bug #5840 from Karsten Loesing), or worse.
This has been broken since we first allowed the behavior of assigning to
non-adjacent slices, in 8.2. Back-patch to all affected versions.
Before exiting walreceiver, fsync() all the WAL received.
Otherwise WAL recovery will replay the un-flushed WAL after walreceiver has
exited, which can lead to a non-recoverable standby if the system crashes hard
at that point.
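
The general pattern, sketched in Python with a hypothetical `write_and_flush` helper: everything written is forced to stable storage before the writer gives up the file. This shows the principle only and is not the walreceiver code.

```python
import os


def write_and_flush(path, chunks):
    """Append all chunks to path and fsync before closing."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        for chunk in chunks:
            view = memoryview(chunk)
            while view:                      # handle short writes
                written = os.write(fd, view)
                view = view[written:]
        os.fsync(fd)                         # data is on disk before we exit
    finally:
        os.close(fd)
```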
Magnus Hagander [Sat, 15 Jan 2011 18:18:14 +0000 (19:18 +0100)]
Enumerate available tablespaces after starting the backup
This closes a race condition where if a tablespace was created
after the enumeration happened but before the do_pg_start_backup()
was called, the backup would be incomplete. Now that it's done
while we are in backup mode, WAL replay will recreate it during
restore.
Treat a WAL sender process that hasn't started streaming yet as a regular
backend, as far as the postmaster shutdown logic is concerned. That means
fast shutdown will wait for WAL sender processes to exit before signaling
bgwriter to finish. This avoids race conditions between a base backup stopping
or starting, and bgwriter writing the shutdown checkpoint WAL record. We don't
want e.g. the end-of-backup WAL record to be written after the shutdown
checkpoint.
Magnus Hagander [Fri, 14 Jan 2011 15:30:33 +0000 (16:30 +0100)]
Use a lexer and grammar for parsing walsender commands
Makes it easier to parse mainly the BASE_BACKUP command
with its options, and avoids having to manually deal
with quoted identifiers in the label (previously broken),
and makes it easier to add new commands and options in
the future.
In passing, refactor the case statement in the walsender
to put each command in its own function.
Tom Lane [Fri, 14 Jan 2011 00:01:28 +0000 (19:01 -0500)]
Code review for postmaster.pid contents changes.
Fix broken test for pre-existing postmaster, caused by wrong code for
appending lines to the lockfile; don't write a failed listen_address
setting into the lockfile; don't arbitrarily change the location of the
data directory in the lockfile compared to previous releases; provide more
consistent and useful definitions of the socket path and listen_address
entries; avoid assuming that pg_ctl has the same DEFAULT_PGSOCKET_DIR as
the postmaster; assorted code style improvements.
Tom Lane [Thu, 13 Jan 2011 19:33:19 +0000 (14:33 -0500)]
Revert incorrect memory-conservation hack in inheritance_planner().
This reverts commit d1001a78ce612a16ea622b558f5fc2b68c45ab4c of 2010-12-05,
which was broken as reported by Jeff Davis. The problem is that the
individual planning steps may have side-effects on substructures of
PlannerGlobal, not only the current PlannerInfo root. Arranging to keep
all such side effects in the main planning context is probably possible,
but it would change this from a quick local hack into a wide-ranging and
rather fragile endeavor. Which it's not worth.
Fix the logic in libpqrcv_receive() to determine if there's any incoming data
that can be read without blocking. It used to conclude that there isn't, even
though there was data in the socket receive buffer. That led walreceiver to
flush the WAL after every received chunk, potentially causing big performance
issues.
Backpatch to 9.0, because the performance impact can be very significant.
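
One common way to ask "is there data that can be read without blocking" is a zero-timeout poll on the socket. The sketch below uses Python's select module; the commit message does not say how the fix was implemented, so this shows the question being asked rather than the patch itself. `data_ready` is a hypothetical name.

```python
import select
import socket


def data_ready(sock):
    """Return True if a read on sock would not block right now."""
    readable, _, _ = select.select([sock], [], [], 0)    # timeout 0: just poll
    return bool(readable)
```

A receiver that wrongly gets False here while data is waiting will treat every chunk as the last one, which matches the flush-after-every-chunk behavior described above.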
Peter Eisentraut [Thu, 13 Jan 2011 07:32:06 +0000 (09:32 +0200)]
Workaround for recursive make breakage
Changing a file two directory levels deep under src/backend/ would not
cause the postgres binary to be rebuilt. This change fixes it, but no
one knows why.
Tom Lane [Thu, 13 Jan 2011 01:47:02 +0000 (20:47 -0500)]
Fix PlanRowMark/ExecRowMark structures to handle inheritance correctly.
In an inherited UPDATE/DELETE, each target table has its own subplan,
because it might have a column set different from other targets. This
means that the resjunk columns we add to support EvalPlanQual might be
at different physical column numbers in each subplan. The EvalPlanQual
rewrite I did for 9.0 failed to account for this, resulting in possible
misbehavior or even crashes during concurrent updates to the same row,
as seen in a recent report from Gordon Shannon. Revise the data structure
so that we track resjunk column numbers separately for each subplan.
I also chose to move responsibility for identifying the physical column
numbers back to executor startup, instead of assuming that numbers derived
during preprocess_targetlist would stay valid throughout subsequent
massaging of the plan. That's a bit slower, so we might want to consider
undoing it someday; but it would complicate the patch considerably and
didn't seem justifiable in a bug fix that has to be back-patched to 9.0.
Tom Lane [Tue, 11 Jan 2011 18:41:13 +0000 (13:41 -0500)]
Adjust basebackup.c to suppress compiler warnings.
Some versions of gcc complain about "variable `tablespaces' might be
clobbered by `longjmp' or `vfork'" with the original coding. Fix by
moving the PG_TRY block into a separate subroutine.
Tom Lane [Tue, 11 Jan 2011 17:12:04 +0000 (12:12 -0500)]
Tweak create_index_paths()'s test for whether to consider a bitmap scan.
Per my note of a couple days ago, create_index_paths would refuse to
consider any path at all for GIN indexes if the selectivity estimate came
out as 1.0; not even if you tried to force it with enable_seqscan. While
this isn't really a bad outcome in practice, it could be annoying for
testing purposes. Adjust the test for "is this path only useful for
sorting" so that it doesn't fire on paths with nil pathkeys, which will
include all GIN paths.
Magnus Hagander [Tue, 11 Jan 2011 09:04:54 +0000 (10:04 +0100)]
Reset walsender ps title in the main loop
When in streaming mode we can never get out, so it will never
be required, but after a base backup (or other operations)
we can get back to the loop, so the title needs to be cleared.
Magnus Hagander [Mon, 10 Jan 2011 13:03:55 +0000 (14:03 +0100)]
Backend support for streaming base backups
Add BASE_BACKUP command to walsender, allowing it to stream a
base backup to the client (in tar format). The syntax is still
far from ideal; that will be fixed in the switch to use a proper
grammar for walsender.
No client included yet, will come as a separate commit.