Bruce Momjian [Tue, 4 Mar 2008 01:33:32 +0000 (01:33 +0000)]
Add ideas for concurrent pg_dump and pg_restore:
< * pg_dump
> * pg_dump / pg_restore
> o Allow pg_dump to utilize multiple CPUs and I/O channels by dumping
> multiple objects simultaneously
>
> The difficulty with this is getting multiple dump processes to
> produce a single dump output file.
> http://archives.postgresql.org/pgsql-hackers/2008-02/msg00205.php
>
> o Allow pg_restore to utilize multiple CPUs and I/O channels by
> restoring multiple objects simultaneously
>
> This might require a pg_restore flag to indicate how many
> simultaneous operations should be performed. Only pg_dump's
> -Fc format has the necessary dependency information.
>
> o To better utilize resources, restore data, primary keys, and
> indexes for a single table before restoring the next table
>
> Hopefully this will allow the CPU-I/O load to be more uniform
> for simultaneous restores. The idea is to start data restores
> for several objects, and once the first object is done, to move
> on to its primary keys and indexes. Over time, simultaneous
> data loads and index builds will be running.
>
> o To better utilize resources, allow pg_restore to check foreign
> keys simultaneously, where possible
> o Allow pg_restore to create all indexes of a table
> concurrently, via a single heap scan
>
> This requires a pg_dump -Fc file because that format contains
> the required dependency information.
> http://archives.postgresql.org/pgsql-general/2007-05/msg01274.php
>
> o Allow pg_restore to load different parts of the COPY data
> simultaneously
< single heap scan, and have a restore of a pg_dump somehow use it
> single heap scan, and have pg_restore use it
< http://archives.postgresql.org/pgsql-general/2007-05/msg01274.php
Bruce Momjian [Mon, 3 Mar 2008 21:00:35 +0000 (21:00 +0000)]
Add another URL for:
o Consider using a ring buffer for COPY FROM
<
< http://archives.postgresql.org/pgsql-hackers/2008-02/msg01080.php
> http://archives.postgresql.org/pgsql-hackers/2008-02/msg01080.php
Bruce Momjian [Mon, 3 Mar 2008 18:45:24 +0000 (18:45 +0000)]
Add:
> * Speed WAL recovery by allowing more than one page to be prefetched
>
> This involves having a separate process that can be told which pages
> the recovery process will need in the near future.
> http://archives.postgresql.org/pgsql-hackers/2008-02/msg01279.php
>
Bruce Momjian [Mon, 3 Mar 2008 15:06:55 +0000 (15:06 +0000)]
Add URLs for sequence discussions:
>
> http://archives.postgresql.org/pgsql-hackers/2008-03/msg00008.php
>
< o %Have ALTER TABLE RENAME rename SERIAL sequence names
> o Have ALTER TABLE RENAME rename SERIAL sequence names
>
> http://archives.postgresql.org/pgsql-hackers/2008-03/msg00008.php
>
> http://archives.postgresql.org/pgsql-hackers/2008-03/msg00008.php
Tom Lane [Sat, 1 Mar 2008 19:26:22 +0000 (19:26 +0000)]
Fix another place that was assuming that a local variable declared as
"struct varlena" would be at least word-aligned. Per buildfarm results
from gypsy_moth. I did a little bit of trawling for other instances of
this coding pattern, and didn't find any; but if we turn up any more
of them I think we'd better revert the "char [4]" patch and find another
way of making tuptoaster.c alignment-safe.
Tom Lane [Sat, 1 Mar 2008 03:26:35 +0000 (03:26 +0000)]
Fix unportable usages of tolower(). On signed-char machines, it is necessary
to explicitly cast the output back to char before comparing it to a char
value, else we get the wrong result for high-bit-set characters. Found by
Rolf Jentsch. Also, fix several places where <ctype.h> functions were being
called without casting the argument to unsigned char; this is likewise
unportable, but we keep making that mistake :-(. These found by buildfarm
member salamander, which I will desperately miss if it ever goes belly-up.
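
A minimal sketch of the casting rule described above, not the patched PostgreSQL code. The helper names safe_tolower and char_ieq are hypothetical. The argument is cast to unsigned char before calling tolower(), and the result is cast back to char before it is compared with a char value.

```c
#include <ctype.h>
#include <stdbool.h>

/*
 * Lower-case a single byte portably (hypothetical helper).
 * Casting the argument to unsigned char keeps high-bit-set bytes from
 * being passed to tolower() as negative ints on signed-char machines.
 * Casting the result back to char makes later comparisons with plain
 * char values behave consistently.
 */
char
safe_tolower(char c)
{
    return (char) tolower((unsigned char) c);
}

/* Case-insensitive comparison of two single bytes (hypothetical helper). */
bool
char_ieq(char a, char b)
{
    return safe_tolower(a) == safe_tolower(b);
}
```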
Tom Lane [Sat, 1 Mar 2008 02:46:49 +0000 (02:46 +0000)]
Disable the undocumented xmlvalidate() function, which was unintentionally
left in the code though it was not meant to be provided. It represents a
security hole because unprivileged users could use it to look at (at least the
first line of) any file readable by the backend. Fortunately, this is only
possible if the backend was built with XML support, so the damage is at least
mitigated; and 8.3 probably hasn't propagated into any security-critical uses
yet anyway. Per report from Sergey Burladyan.
Tom Lane [Fri, 29 Feb 2008 17:47:41 +0000 (17:47 +0000)]
Reducing the assumed alignment of struct varlena means that the compiler
is also licensed to put a local variable declared that way at an unaligned
address. Which will not work if the variable is then manipulated with
SET_VARSIZE or other macros that assume alignment. So the previous patch
is not an unalloyed good, but on balance I think it's still a win, since
we have very few places that do that sort of thing. Fix the one place in
tuptoaster.c that does it. Per buildfarm results from gypsy_moth
(I'm a bit surprised that only one machine showed a failure).
Magnus Hagander [Fri, 29 Feb 2008 15:31:33 +0000 (15:31 +0000)]
Fix handling of restricted processes for Windows Vista (mainly),
by explicitly adding back the user to the DACL of the new process.
This fixes the failure case when executing as the Administrator
user, which had no permissions left at all after we dropped the
Administrators group.
Neil Conway [Fri, 29 Feb 2008 02:49:39 +0000 (02:49 +0000)]
Fix several memory leaks when rescanning SRFs. Arrange for an SRF's
"multi_call_ctx" to be a distinct sub-context of the EState's per-query
context, and delete the multi_call_ctx as soon as the SRF finishes
execution. This avoids leaking SRF memory until the end of the current
query, which is particularly egregious when the SRF is scanned
multiple times. This change also fixes a leak of the fields of the
AttInMetadata struct in shutdown_MultiFuncCall().
Also fix a leak of the SRF result TupleDesc when rescanning a
FunctionScan node. The TupleDesc is allocated in the per-query context
for every call to ExecMakeTableFunctionResult(), so we should free it
after calling that function. Since the SRF might choose to return
a non-expendable TupleDesc, we only free the TupleDesc if it is
not being reference-counted.
Peter Eisentraut [Wed, 27 Feb 2008 20:31:01 +0000 (20:31 +0000)]
Change expand_subsys function so that it preserves the relative order of
the files passed as argument. This is desirable so that the dtrace rule
in src/backend/Makefile works.
Tom Lane [Wed, 27 Feb 2008 17:44:19 +0000 (17:44 +0000)]
If RelationBuildDesc() fails to open a critical system index, PANIC with
a relevant error message instead of just dumping core. Odd that nobody
reported this before Darren Reed.
Peter Eisentraut [Tue, 26 Feb 2008 16:07:16 +0000 (16:07 +0000)]
In the SSH setup instructions, change
ssh -L 3333:foo.com:5432 joe@foo.com
to
ssh -L 3333:localhost:5432 joe@foo.com
The old form assumes that the postgres server on foo.com allows
connections from foo.com, which is not allowed by the default
listen_addresses setting. Add more detail explaining this.
Pointed out by Faheem Mitha.
Also change the example port number 3333 to 63333 so no one can complain
that we are stealing a reserved port number.
Peter Eisentraut [Tue, 26 Feb 2008 13:31:40 +0000 (13:31 +0000)]
Create two separate libpq.rc's: One that is built at build time, and one
that is shipped in the distribution, named libpq-dist.rc. This way the
build system doesn't get upset when a distributed file is forcibly
overwritten during a normal build.
Peter Eisentraut [Tue, 26 Feb 2008 10:45:24 +0000 (10:45 +0000)]
Reorganize some of the exports list generation code. It seems that this
has been reinvented about four different times throughout history (aix,
cygwin, win32, darwin/linux) and a lot of the concepts are actually shared,
which the code now shows better.
Peter Eisentraut [Tue, 26 Feb 2008 07:20:38 +0000 (07:20 +0000)]
We don't need to rebuild objfiles.txt every time an object file changes.
So only rebuild when a makefile changes (which presumably defines the
file list somewhere), and only touch the file if an object changed. The
touch is necessary so the parent make knows something changed and
ultimately rebuilds postgres.
Tom Lane [Tue, 26 Feb 2008 02:54:08 +0000 (02:54 +0000)]
Fix encode(...bytea..., 'escape') so that it converts all high-bit-set byte
values into \nnn octal escape sequences. When the database encoding is
multibyte this is *necessary* to avoid generating invalidly encoded text.
Even in a single-byte encoding, the old behavior seems very hazardous ---
consider for example what happens if the text is transferred to another
database with a different encoding. Decoding would then yield some other
bytea value than what was encoded, which is surely undesirable. Per gripe
from Hernan Gonzalez.
Backpatch to 8.3, but not further. This is a bit of a judgment call, but I
make it on these grounds: pre-8.3 we don't really have much encoding safety
anyway because of the convert() function family, and we would also have much
higher risk of breaking existing apps that may not be expecting this behavior.
8.3 is still new enough that we can probably get away with making this change
in the function's behavior.
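
The escaping rule can be sketched roughly as follows. This is an illustration under stated assumptions, not the backend's implementation: escape_bytes is a hypothetical name, and the handling of zero bytes and backslashes is an assumption added so that the output stays unambiguous. Only the high-bit-set case comes from the message above.

```c
#include <stddef.h>

/*
 * Write an 'escape'-style rendering of src[0..len) into dst and return
 * the number of characters written, not counting the trailing NUL.
 * The caller must provide room for 4 * len + 1 bytes.
 */
size_t
escape_bytes(const unsigned char *src, size_t len, char *dst)
{
    size_t n = 0;

    for (size_t i = 0; i < len; i++)
    {
        unsigned char c = src[i];

        if (c == 0 || (c & 0x80))
        {
            /* zero (assumption) and high-bit-set bytes become \nnn octal */
            dst[n++] = '\\';
            dst[n++] = (char) ('0' + (c >> 6));
            dst[n++] = (char) ('0' + ((c >> 3) & 7));
            dst[n++] = (char) ('0' + (c & 7));
        }
        else if (c == '\\')
        {
            /* assumption: double the backslash so decoding is unambiguous */
            dst[n++] = '\\';
            dst[n++] = '\\';
        }
        else
            dst[n++] = (char) c;
    }
    dst[n] = '\0';
    return n;
}
```

Because every byte above 0x7F is written as plain ASCII, the result is valid text in any database encoding.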
Tom Lane [Mon, 25 Feb 2008 23:36:28 +0000 (23:36 +0000)]
Reject year zero during datetime input, except when it's a 2-digit year
(then it means 2000 AD). Formerly we silently interpreted this as 1 BC,
which at best is unwarranted familiarity with the implementation.
It's barely possible that some app somewhere expects the old behavior,
though, so we won't back-patch this into existing release branches.
Tom Lane [Mon, 25 Feb 2008 23:21:01 +0000 (23:21 +0000)]
Fix datetime input to behave correctly for Feb 29 in years BC.
Formerly, DecodeDate attempted to verify the day-of-the-month exactly, but
it was under the misapprehension that it would know whether we were looking
at a BC year or not. In reality this check can't be made until the calling
function (eg DecodeDateTime) has processed all the fields. So, split the
BC adjustment and validity checks out into a new function ValidateDate that
is called only after processing all the fields. In passing, this patch
makes DecodeTimeOnly work for BC inputs, which it never did before.
(The historical veracity of all this is nonexistent, of course, but if
we're going to say we support proleptic Gregorian calendar then we should
do it correctly. In any case the unpatched code is broken because it could
emit dates that it would then reject on re-inputting.)
Per report from Bernd Helmle. Back-patch as far as 8.0; in 7.x we were
not using our own calendar support and so this seems a bit too risky
to put into 7.4.
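
A minimal sketch of the deferred check, assuming the proleptic Gregorian rule with 1 BC mapped to internal year 0, 2 BC to -1, and so on. The function validate_date and its signature are hypothetical and only illustrate why the BC flag must be known before the day-of-month test runs.

```c
#include <stdbool.h>

/* Gregorian leap-year rule; also valid for zero and negative years. */
bool
is_leap(int y)
{
    return y % 4 == 0 && (y % 100 != 0 || y % 400 == 0);
}

/*
 * Hypothetical validator, meant to run only after every input field
 * has been parsed. "year" is the year as written (1 or greater) and
 * "bc" says whether a BC marker was seen anywhere in the input.
 */
bool
validate_date(int year, int month, int day, bool bc)
{
    static const int mdays[12] =
        {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
    int y;
    int limit;

    if (year < 1 || month < 1 || month > 12 || day < 1)
        return false;           /* a written year of zero is rejected too */

    y = bc ? -(year - 1) : year;    /* 1 BC -> 0, 2 BC -> -1, ... */
    limit = mdays[month - 1];
    if (month == 2 && is_leap(y))
        limit = 29;
    return day <= limit;
}
```

Under this mapping, Feb 29 is accepted for 1 BC and 5 BC but not for 4 BC, which is the opposite of what a check made before the BC adjustment would conclude.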
Peter Eisentraut [Mon, 25 Feb 2008 17:55:42 +0000 (17:55 +0000)]
Link postgres from all object files at once, to avoid the error-prone
SUBSYS.o step and allow for better optimization by the linker.
Instead of partial linking into SUBSYS.o, the list of object files is
assembled in objfiles.txt files that are expanded when the final
linking is done.
Because we are not yet sure how long command lines different platforms
can handle, the old way of linking is still available, by defining the
make variable PARTIAL_LINKING (e.g., make all PARTIAL_LINKING=1). If
we determine that this is necessary for some platforms, then we will
document this in a more prominent place.
Tom Lane [Sat, 23 Feb 2008 19:11:45 +0000 (19:11 +0000)]
Change the declaration of struct varlena so that the length word is
represented as "char ...[4]" not "int32". Since the length word is never
supposed to be accessed via this struct member anyway, this won't break
any existing code that is following the rules. The advantage is that C
compilers will no longer assume that a pointer to struct varlena is
word-aligned, which prevents incorrect optimizations in TOAST-pointer
access and perhaps other places. gcc doesn't seem to do this (at least
not at -O2), but the problem is demonstrable on some other compilers.
I changed struct inet as well, but didn't bother to touch a lot of other
struct definitions in which it wouldn't make any difference because there
were other fields forcing int alignment anyway. Hopefully none of those
struct definitions are used for accessing unaligned Datums.
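
The effect on assumed alignment can be illustrated with two stand-in structs. This is a sketch, not the real header: the struct names, read_len and write_len are hypothetical, and int32_t stands in for the backend's int32. On mainstream ABIs the char-array variant typically has an alignment requirement of 1, so a compiler may no longer assume that a pointer to it is word-aligned.

```c
#include <stdint.h>
#include <string.h>

/* Old-style layout: the int32 member forces word alignment. */
struct len_as_int
{
    int32_t vl_len_;
    char    vl_dat[1];
};

/* New-style layout: same size for the length word, no alignment forced. */
struct len_as_bytes
{
    char vl_len_[4];
    char vl_dat[1];
};

/* Read a 4-byte length word from a possibly unaligned address. */
uint32_t
read_len(const void *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof v);
    return v;
}

/* Write a 4-byte length word to a possibly unaligned address. */
void
write_len(void *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}
```

Using memcpy for the access is one common way to stay alignment-safe; it is offered here as an alternative illustration and is not claimed to be what the backend macros do.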
Tom Lane [Wed, 20 Feb 2008 22:46:24 +0000 (22:46 +0000)]
Rename miscadmin.h's PG_VERSIONSTR macro to PG_BACKEND_VERSIONSTR to
make it a bit clearer what it is, and get rid of duplicate definitions
in initdb and pg_ctl.
Tom Lane [Wed, 20 Feb 2008 22:18:15 +0000 (22:18 +0000)]
Fix mistakes in pg_ctl's code for "start -w" that tries to cope with
non-default settings for the postmaster's port number. The code to parse
command line options and postgresql.conf entries wasn't quite right about
whitespace or quotes, and it was coded in a not-very-readable way too.
Per bug #3969 from Itagaki Takahiro, though this is more extensive than his
proposed patch (which fixed only the whitespace problem).
This code has been broken since it was introduced in 8.0, so patch all
the way back.
Tom Lane [Wed, 20 Feb 2008 17:44:09 +0000 (17:44 +0000)]
Put a CHECK_FOR_INTERRUPTS call into the loops that try to find a unique new
OID or new relfilenode. If the existing OIDs are sufficiently densely
populated, this could take a long time (perhaps even be an infinite loop),
so it seems wise to allow the system to respond to a cancel interrupt here.
Per a gripe from Jacky Leng.
Backpatch as far as 8.1. Older versions just fail on OID collision,
instead of looping.
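
The idea can be sketched outside the backend as a search loop that polls a cancel flag on every iteration. Everything here is hypothetical (cancel_requested, next_free_id, below_1000); the real code uses the CHECK_FOR_INTERRUPTS macro rather than a plain flag test.

```c
#include <signal.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag; a signal handler would set it on a cancel request. */
volatile sig_atomic_t cancel_requested = 0;

/* Demo predicate: pretend every id below 1000 is already taken. */
bool
below_1000(uint32_t id)
{
    return id < 1000;
}

/*
 * Search upward from "start" for an id that is not in use.
 * Returns true and stores the id in *out, or returns false if a cancel
 * was requested while searching. Without the flag test, a densely
 * populated id space could keep this loop busy for a very long time.
 */
bool
next_free_id(uint32_t start, bool (*in_use) (uint32_t), uint32_t *out)
{
    uint32_t id = start;

    for (;;)
    {
        if (cancel_requested)
            return false;       /* respond to the interrupt promptly */
        if (!in_use(id))
        {
            *out = id;
            return true;
        }
        id++;                   /* unsigned arithmetic wraps around */
    }
}
```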
Tom Lane [Mon, 18 Feb 2008 23:00:32 +0000 (23:00 +0000)]
Remove unnecessary opening of other relation in RI_FKey_keyequal_upd_pk
and RI_FKey_keyequal_upd_fk, as well as no-longer-needed calls of
ri_BuildQueryKeyFull. Aside from saving a few cycles, this avoids needless
deadlock risks when an update is not changing the columns that participate
in an RI constraint. Per a gripe from Alexey Nalbat.
Back-patch to 8.3. Earlier releases did have a need to open the other
relation due to the way in which they retrieved information about the RI
constraint, so this problem unfortunately can't easily be improved pre-8.3.
Bruce Momjian [Mon, 18 Feb 2008 21:46:22 +0000 (21:46 +0000)]
autoconf 2.61's AC_FUNC_FSEEKO reports success/failure differently, so
reorganize code for NetBSD/BSDi port/fseeko.c usage, and make code more
modular.
Michael Meskes [Sun, 17 Feb 2008 18:14:29 +0000 (18:14 +0000)]
- Removed duplicate include of ecpgtype.h which meant I had to adapt all expected results.
- Changed INFORMIX mode symbol definition yet again because the old way didn't work on NetBSD. Hopefully this one does.
Peter Eisentraut [Sun, 17 Feb 2008 16:36:43 +0000 (16:36 +0000)]
Upgrade to Autoconf 2.61:
- Change configure.in to use Autoconf 2.61 and update generated files.
- Update build system and documentation to support new directory variables
offered by Autoconf 2.61.
- Replace usages of PGAC_CHECK_ALIGNOF by AC_CHECK_ALIGNOF, now available
in Autoconf 2.61.
- Drop our patched version of AC_C_INLINE, as Autoconf now has the change.
Tom Lane [Sun, 17 Feb 2008 02:09:32 +0000 (02:09 +0000)]
Replace time_t with pg_time_t (same values, but always int64) in on-disk
data structures and backend internal APIs. This solves problems we've seen
recently with inconsistent layout of pg_control between machines that have
32-bit time_t and those that have already migrated to 64-bit time_t. Also,
we can get out from under the problem that Windows' Unix-API emulation is not
consistent about the width of time_t.
There are a few remaining places where local time_t variables are used to hold
the current or recent result of time(NULL). I didn't bother changing these
since they do not affect any cross-module APIs and surely all platforms will
have 64-bit time_t before overflow becomes an actual risk. time_t should
be avoided for anything visible to extension modules, however.
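
A minimal sketch of the approach, assuming int64_t as a stand-in for the backend's int64. The struct checkpoint_stamp and the function current_pg_time are hypothetical. They only show that a field declared as pg_time_t has the same width on every platform, while time_t is confined to the point where the system clock is read.

```c
#include <stdint.h>
#include <time.h>

/* Same values as time_t, but always 64 bits wide. */
typedef int64_t pg_time_t;

/* Hypothetical on-disk record: its layout no longer depends on time_t. */
struct checkpoint_stamp
{
    pg_time_t when;
    uint32_t  flags;
};

/* Convert at the boundary where the platform's time() is called. */
pg_time_t
current_pg_time(void)
{
    return (pg_time_t) time(NULL);
}
```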
Tom Lane [Sat, 16 Feb 2008 21:51:04 +0000 (21:51 +0000)]
Update docs to reflect the fact that we can now deal with DST rules
outside the 32-bit-time_t range. Also, refer to Olson's tz database
as the 'zoneinfo' database, a name that upstream sometimes uses, not
'zic database' which they never use.
Tom Lane [Sat, 16 Feb 2008 21:16:04 +0000 (21:16 +0000)]
Update timezone code to track the upstream changes since 2003. In particular
this adds support for 64-bit tzdata files, which is needed to support DST
calculations beyond 2038. Add a regression test case to give some minimal
confidence that that really works.
Bruce Momjian [Sat, 16 Feb 2008 21:03:30 +0000 (21:03 +0000)]
Rename a libpq NOT_USED SSL function to
verify_peer_name_matches_certificate(), clarify some of the function's
variables and logic, and update a comment. This should make SSL
improvements easier in the future.
Tom Lane [Fri, 15 Feb 2008 22:17:06 +0000 (22:17 +0000)]
Allow AS to be omitted when specifying an output column name in SELECT
(or RETURNING), but only when the output name is not any SQL keyword.
This seems as close as we can get to the standard's syntax without a
great deal of thrashing. Original patch by Hiroshi Saito, amended by me.
Tom Lane [Fri, 15 Feb 2008 17:19:46 +0000 (17:19 +0000)]
Remove ancient restriction that LIMIT/OFFSET can't contain a sub-select.
This was probably protecting some implementation limitation when it was
put in, but as far as I can tell the planner and executor have no such
assumption anymore; the case seems to work fine. Per a gripe from
Grzegorz Jaskiewicz.
Tom Lane [Thu, 14 Feb 2008 17:33:37 +0000 (17:33 +0000)]
Sync our regex code with upstream changes since last time we did this, which
was Tcl 8.4.8. The main changes are to remove the never-fully-implemented
code for multi-character collating elements, and to const-ify some stuff a
bit more fully. In combination with the recent security patch, this commit
brings us into line with Tcl 8.5.0.
Note that I didn't make any effort to duplicate a lot of cosmetic changes
that they made to bring their copy into line with their own style
guidelines, such as adding braces around single-line IF bodies. Most of
those we either had done already (such as ANSI-fication of function headers)
or there is no point because pgindent would undo the change anyway.
Tom Lane [Tue, 12 Feb 2008 04:09:44 +0000 (04:09 +0000)]
Fix SPI_cursor_open() and SPI_is_cursor_plan() to push the SPI stack before
doing anything interesting, such as calling RevalidateCachedPlan(). The
necessity of this is demonstrated by an example from Willem Buitendyk:
during a replan, the planner might try to evaluate SPI-using functions,
and so we'd better be in a clean SPI context.
A small downside of this fix is that these two functions will now fail
outright if called when not inside a SPI-using procedure (ie, a
SPI_connect/SPI_finish pair). The documentation never promised or suggested
that that would work, though; and they are normally used in concert with
other functions, mainly SPI_prepare, that always have failed in such a case.
So the odds of breaking something seem pretty low.
In passing, make SPI_is_cursor_plan's error handling convention clearer,
and fix documentation's erroneous claim that SPI_cursor_open would
return NULL on error.
Before 8.3 these functions could not invoke replanning, so there is probably
no need for back-patching.
Tom Lane [Mon, 11 Feb 2008 19:14:30 +0000 (19:14 +0000)]
Repair VACUUM FULL bug introduced by HOT patch: the original way of
calculating a page's initial free space was fine, and should not have been
"improved" by letting PageGetHeapFreeSpace do it. VACUUM FULL is going to
reclaim LP_DEAD line pointers later, so there is no need for a guard
against the page being too full of line pointers, and having one risks
rejecting pages that are perfectly good move destinations.
This also exposed a second bug, which is that the empty_end_pages logic
assumed that any page with no live tuples would get entered into the
fraged_pages list automatically (by virtue of having more free space than
the threshold in the do_frag calculation). This assumption certainly
seems risky when a low fillfactor has been chosen, and even without
tunable fillfactor I think it could conceivably fail on a page with many
unused line pointers. So fix the code to force do_frag true when notup
is true, and patch this part of the fix all the way back.
Tom Lane [Sun, 10 Feb 2008 20:39:08 +0000 (20:39 +0000)]
Fix PageGetExactFreeSpace() so that it actually behaves sensibly
if pd_lower > pd_upper, rather than merely claiming to. This would
only matter if the page header were corrupt, which shouldn't occur,
but ...
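
The intended behavior can be sketched as below. The struct page_header_sketch and the function exact_free_space are hypothetical stand-ins that keep only the two fields named above. The subtraction is done in signed arithmetic so that a corrupt header with pd_lower > pd_upper yields zero rather than a huge unsigned value.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cut-down page header with only the two relevant fields. */
struct page_header_sketch
{
    uint16_t pd_lower;          /* offset to start of free space */
    uint16_t pd_upper;          /* offset to end of free space */
};

/* Free space between pd_lower and pd_upper, never negative. */
size_t
exact_free_space(const struct page_header_sketch *p)
{
    int space = (int) p->pd_upper - (int) p->pd_lower;

    if (space < 0)
        return 0;               /* corrupt header: report no free space */
    return (size_t) space;
}
```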
Tom Lane [Fri, 8 Feb 2008 17:58:46 +0000 (17:58 +0000)]
Since GSSAPI and SSPI authentication don't work in protocol version 2,
issue a helpful error message instead of sending unparsable garbage.
(It is clearly a design error that this doesn't work, but fixing it
is not worth the trouble at this point.) Per discussion.
Tom Lane [Thu, 7 Feb 2008 22:58:35 +0000 (22:58 +0000)]
Avoid misbehavior in foreign key checks when casting to a datatype for which
the parser supplies a default typmod that can result in data loss (ie,
truncation). Currently that appears to be only CHARACTER and BIT.
We can avoid the problem by specifying the type's internal name instead
of using SQL-spec syntax. Since the queries generated here are only used
internally, there's no need to worry about portability. This problem is
new in 8.3; before we just let the parser do whatever it wanted to resolve
the operator, but 8.3 is trying to be sure that the semantics of FK checks
are consistent. Per report from Harald Fuchs.
Tom Lane [Thu, 7 Feb 2008 21:07:55 +0000 (21:07 +0000)]
Some variants of ALTER OWNER tried to make the "object" field of the
statement be a list of bare C strings, rather than String nodes, which is
what they need to be for copyfuncs/equalfuncs to work. Fortunately these
node types never go out to disk (if they did, we'd likely have noticed the
problem sooner), so we can just fix it without creating a need for initdb.
This bug has been there since 8.0, but 8.3 exposes it in a more common
code path (Parse messages) than prior releases did. Per bug #3940 from
Vladimir Kokovic.