Tom Lane [Tue, 11 Oct 2016 14:08:45 +0000 (10:08 -0400)]
Improve documentation for CREATE RECURSIVE VIEW.
It was perhaps not entirely clear that internal self-references shouldn't
be schema-qualified even if the view name is written with a schema.
Spell it out.
Tom Lane [Mon, 10 Oct 2016 19:11:33 +0000 (15:11 -0400)]
Update user docs for switch to POSIX semaphores.
Since commit ecb0d20a9 hasn't crashed and burned, here's the promised
docs update for it.
In addition to explaining that Linux and FreeBSD ports now use POSIX
semaphores, I did some wordsmithing on pre-existing wording; in
particular trying to clarify which SysV parameters need to be set with
an eye to total usage across all applications.
Andres Freund [Mon, 10 Oct 2016 20:41:57 +0000 (13:41 -0700)]
Make regression tests less dependent on hash table order.
Upcoming changes to the hash table code used, among others, for grouping
and set operations will change the output order for a few queries. To
make it less likely that actual bugs are hidden between regression test
ordering changes, and to make the tests robust against platform
dependant ordering, add ORDER BYs guaranteeing the output order.
As it's possible that some of the changes expose platform dependant
ordering, push this earlier, to let the buildfarm shake out potentially
unstable results.
Tom Lane [Mon, 10 Oct 2016 14:35:58 +0000 (10:35 -0400)]
In PQsendQueryStart(), avoid leaking any left-over async result.
Ordinarily there would not be an async result sitting around at this
point, but it appears that in corner cases there can be. Considering
all the work we're about to launch, it's hardly going to cost anything
noticeable to check.
It's been like this forever, so back-patch to all supported branches.
Tom Lane [Sun, 9 Oct 2016 22:03:45 +0000 (18:03 -0400)]
Use unnamed POSIX semaphores, if available, on Linux and FreeBSD.
We've had support for using unnamed POSIX semaphores instead of System V
semaphores for quite some time, but it was not used by default on any
platform. Since many systems have rather small limits on the number of
SysV semaphores allowed, it seems desirable to switch to POSIX semaphores
where they're available and don't create performance or kernel resource
problems. Experimentation by me shows that unnamed POSIX semaphores
are at least as good as SysV semaphores on Linux, and we previously had
a report from Maksym Sobolyev that FreeBSD is significantly worse with
SysV semaphores than POSIX ones. So adjust those two platforms to use
unnamed POSIX semaphores, if configure can find the necessary library
functions. If this goes well, we may switch other platforms as well,
but it would be advisable to test them individually first.
It's not currently contemplated that we'd encourage users to select
a semaphore API for themselves, but anyone who wants to experiment
can add PREFERRED_SEMAPHORES=UNNAMED_POSIX (or NAMED_POSIX, or SYSV)
to their configure command line to do so.
I also tweaked configure to report which API it's selected, mainly
so that we can tell that from buildfarm reports.
I did not touch the user documentation's discussion about semaphores;
that will need some adjustment once the dust settles.
Tom Lane [Sun, 9 Oct 2016 16:49:37 +0000 (12:49 -0400)]
Fix incorrect handling of polymorphic aggregates used as window functions.
The transfunction was told that its first argument and result were
of the window function output type, not the aggregate state type.
This'd only matter if the transfunction consults get_fn_expr_argtype,
which typically only polymorphic functions would do.
Although we have several regression tests around polymorphic aggs,
none of them detected this mistake --- in fact, they still didn't
fail when I injected the same mistake into nodeAgg.c. So add some
more tests covering both plain agg and window-function-agg cases.
Per report from Sebastian Luque. Back-patch to 9.6 where the error
was introduced (by sloppy refactoring in commit 804163bc2, looks like).
Tom Lane [Sat, 8 Oct 2016 23:29:27 +0000 (19:29 -0400)]
Fix two bugs in merging of inherited CHECK constraints.
Historically, we've allowed users to add a CHECK constraint to a child
table and then add an identical CHECK constraint to the parent. This
results in "merging" the two constraints so that the pre-existing
child constraint ends up with both conislocal = true and coninhcount > 0.
However, if you tried to do it in the other order, you got a duplicate
constraint error. This is problematic for pg_dump, which needs to issue
separated ADD CONSTRAINT commands in some cases, but has no good way to
ensure that the constraints will be added in the required order.
And it's more than a bit arbitrary, too. The goal of complaining about
duplicated ADD CONSTRAINT commands can be served if we reject the case of
adding a constraint when the existing one already has conislocal = true;
but if it has conislocal = false, let's just make the ADD CONSTRAINT set
conislocal = true. In this way, either order of adding the constraints
has the same end result.
Another problem was that the code allowed creation of a parent constraint
marked convalidated that is merged with a child constraint that is
!convalidated. In this case, an inheritance scan of the parent table could
emit some rows violating the constraint condition, which would be an
unexpected result given the marking of the parent constraint as validated.
Hence, forbid merging of constraints in this case. (Note: valid child and
not-valid parent seems fine, so continue to allow that.)
Per report from Benedikt Grundmann. Back-patch to 9.2 where we introduced
possibly-not-valid check constraints. The second bug obviously doesn't
apply before that, and I think the first doesn't either, because pg_dump
only gets into this situation when dealing with not-valid constraints.
Tom Lane [Sat, 8 Oct 2016 01:12:25 +0000 (21:12 -0400)]
libpqwalreceiver needs to link with libintl when using --enable-nls.
The need for this was previously obscured even on picky platforms
by the hack we used to support direct cross-module references in
the transforms contrib modules. Now that that hack is gone, the
undefined symbol is exposed, as reported by Robert Haas.
Back-patch to 9.5 where we started to use -Wl,-undefined,dynamic_lookup.
I'm a bit surprised that the older branches don't seem to contain
any gettext references in this module, but since they don't fail
at build time, they must not. (We might be able to get away with
leaving this alone in 9.5/9.6, but I think it's cleaner if the
reference gets resolved at link time.)
Andres Freund [Fri, 7 Oct 2016 23:55:15 +0000 (16:55 -0700)]
Fix fallback implementation of pg_atomic_write_u32().
I somehow had assumed that in the spinlock (in turn possibly using
semaphores) based fallback atomics implementation 32 bit writes could be
done without a lock. As far as the write goes that's correct, since
postgres supports only platforms with single-copy atomicity for aligned
32bit writes. But writing without holding the spinlock breaks
read-modify-write operations like pg_atomic_compare_exchange_u32(),
since they'll potentially "miss" a concurrent write, which can't happen
in actual hardware implementations.
In 9.6+ when using the fallback atomics implementation this could lead
to buffer header locks not being properly marked as released, and
potentially some related state corruption. I don't see a related danger
in 9.5 (earliest release with the API), because pg_atomic_write_u32()
wasn't used in a concurrent manner there.
The state variable of local buffers, before this change, were
manipulated using pg_atomic_write_u32(), to avoid unnecessary
synchronization overhead. As that'd not be the case anymore, introduce
and use pg_atomic_unlocked_write_u32(), which does not correctly
interact with RMW operations.
This bug only caused issues when postgres is compiled on platforms
without atomics support (i.e. no common new platform), or when compiled
with --disable-atomics, which explains why this wasn't noticed in
testing.
Reported-By: Tom Lane
Discussion: <14947.1475690465@sss.pgh.pa.us>
Backpatch: 9.5-, where the atomic operations API was introduced.
Remove bogus mapping from UTF-8 to SJIS conversion table.
0xc19c is not a valid UTF-8 byte sequence. It doesn't do any harm, AFAICS,
but it's surely not intentional. No backpatching though, just to be sure.
In the passing, also add a file header comment to the file, like the
UCS_to_SJIS.pl script would produce. (The file was originally created with
UCS_to_SJIS.pl, but has been modified by hand since then. That's
questionable, but I'll leave fixing that for later.)
Make TAP test suites to work, when @INC does not contain current dir.
Recent Perl and/or new Linux distributions are starting to remove "." from
the @INC list by default. That breaks pg_rewind and ssl test suites, which
use helper perl modules that reside in the same directory. To fix, add the
current source directory explicitly to prove's include dir.
The vcregress.pl script probably also needs something like this, but I
wasn't able to remove '.' from @INC on Windows to test this, and don't want
to try doing that blindly.
Tom Lane [Fri, 7 Oct 2016 15:27:34 +0000 (11:27 -0400)]
Fix python shlib probe for Cygwin.
On buildfarm member cockatiel, that library is in /usr/bin.
(Possibly we should look at $PATH on this platform?)
Per off-list report from Andrew Dunstan.
Tom Lane [Fri, 7 Oct 2016 13:51:18 +0000 (09:51 -0400)]
Fix pg_dump to work against pre-9.0 servers again.
getBlobs' queries for pre-9.0 servers were broken in two ways:
the 7.x/8.x query uses DISTINCT so it can't have unspecified-type
NULLs in the target list, and both that query and the 7.0 one
failed to provide the correct output column labels, so that the
subsequent code to extract data from the PGresult would fail.
Back-patch to 9.6 where the breakage was introduced (by commit 23f34fa4b).
Clear OpenSSL error queue after failed X509_STORE_load_locations() call.
Leaving the error in the error queue used to be harmless, because the
X509_STORE_load_locations() call used to be the last step in
initialize_SSL(), and we would clear the queue before the next
SSL_connect() call. But previous commit moved things around. The symptom
was that if a CRL file was not found, and one of the subsequent
initialization steps, like loading the client certificate or private key,
failed, we would incorrectly print the "no such file" error message from
the earlier X509_STORE_load_locations() call as the reason.
Backpatch to all supported versions, like the previous patch.
1. There was a race condition, if two threads opened a connection at the
same time. We used a mutex around SSL_CTX_* calls, but that was not
enough, e.g. if one thread SSL_CTX_load_verify_locations() with one
path, and another thread set it with a different path, before the first
thread got to establish the connection.
2. Opening two different connections, with different sslrootcert settings,
seemed to fail outright with "SSL error: block type is not 01". Not sure
why.
3. We created the SSL object, before calling SSL_CTX_load_verify_locations
and SSL_CTX_use_certificate_chain_file on the SSL context. That was
wrong, because the options set on the SSL context are propagated to the
SSL object, when the SSL object is created. If they are set after the
SSL object has already been created, they won't take effect until the
next connection. (This is bug #14329)
At least some of these could've been fixed while still using a shared
context, but it would've been more complicated and error-prone. To keep
things simple, let's just use a separate SSL context for each connection,
and accept the overhead.
If you point pg_rewind to a server that is using synchronous replication,
with "pg_rewind --source-server=...", and the replication is not working
for some reason, pg_rewind will get stuck because it creates a temporary
table, which needs to be replicated. You could call broken replication a
pilot error, but pg_rewind is often used in special circumstances, when
there are changes to the replication setup.
We don't do any "real" updates, and we don't care about fsyncing or
replicating the operations on the temporary tables, so fix that by
setting synchronous_commit off.
Michael Banck, Michael Paquier. Backpatch to 9.5, where pg_rewind was
introduced.
Fix excessive memory consumption in the new sort pre-reading code.
LogicalTapeRewind() should not allocate large read buffer, if the tape
is completely empty. The calling code relies on that, for its
calculation of how much memory to allocate for the read buffers. That
lead to massive overallocation of memory, if maxTapes was high, but
only a few tapes were actually used.
Tom Lane [Thu, 6 Oct 2016 03:03:55 +0000 (23:03 -0400)]
Remove -Wl,-undefined,dynamic_lookup in macOS build.
We don't need this anymore, and it prevents build-time error checking
that's usually good to have, so remove it. Undoes one change of commit cac765820.
Unfortunately, it's much harder to get a similar effect on other common
platforms, because we don't want the linker to throw errors for symbols
that will be resolved in the core backend. Only macOS and AIX expect the
core backend executable to be available while linking loadable modules,
so only these platforms can usefully throw errors for unresolved symbols
at link time.
Tom Lane [Thu, 6 Oct 2016 02:47:23 +0000 (22:47 -0400)]
Try to fix python shlib probe for MinGW.
Per discussion with Andrew Dunstan, it seems there are three peculiarities
of the Python installation on MinGW (or at least, of the instance on
buildfarm member frogmouth). First, the library name doesn't contain
"2.7" but just "27". It looks like we can deal with that by consulting
get_config_vars('VERSION') to see whether a dot should be used or not.
Second, the library might be in c:/Windows/System32, analogously to it
possibly being in /usr/lib on Unix-oid platforms. Third, it's apparently
not standard to use the prefix "lib" on the file name. This patch will
accept files with or without "lib", but the first part of that may well
be dead code.
Tom Lane [Wed, 5 Oct 2016 15:44:57 +0000 (11:44 -0400)]
In python shlib probe, cater for OpenBSD, which omits the .so symlink.
Most Unix-oid platforms provide a symlink "libfoo.so" -> "libfoo.so.n.n"
to allow the linker to resolve a reference "-lfoo" to a version-numbered
shared library. OpenBSD has apparently hacked ld(1) to do this internally,
because there are no such symlinks to be found in their library
directories. Tweak the new code in PGAC_CHECK_PYTHON_EMBED_SETUP to cope.
Per buildfarm member curculio.
Robert Haas [Wed, 5 Oct 2016 12:04:52 +0000 (08:04 -0400)]
Rename WAIT_* constants to PG_WAIT_*.
Windows apparently has a constant named WAIT_TIMEOUT, and some of these
other names are pretty generic, too. Insert "PG_" at the front of each
name in order to disambiguate.
Tom Lane [Tue, 4 Oct 2016 21:49:07 +0000 (17:49 -0400)]
Avoid direct cross-module links in hstore_plperl and ltree_plpython, too.
Just turning the crank on the project started in commit d51924be8.
These cases turn out to be exact subsets of the boilerplate needed
for hstore_plpython.
Tom Lane [Tue, 4 Oct 2016 19:23:02 +0000 (15:23 -0400)]
Improve (I hope) our autolocation of the Python shared library.
Older versions of Python produce garbage (or at least useless) values of
get_config_vars('LDLIBRARY'). Newer versions produce garbage (or at least
useless) values of get_config_vars('SO'), which was defeating our configure
logic that attempted to identify where the Python shlib really is. The net
result, at least with a stock Python 3.5 installation on macOS, was that
we were linking against a static library in the mistaken belief that it was
a shared library. This managed to work, if you count statically absorbing
libpython into plpython.so as working. But it no longer works as of commit d51924be8, because now we get separate static copies of libpython in
plpython.so and hstore_plpython.so, and those can't interoperate on the
same data. There are some other infelicities like assuming that nobody
ever installs a private version of Python on a macOS machine.
Hence, forget about looking in $python_configdir for the Python shlib;
as far as I can tell no version of Python has ever put one there, and
certainly no currently-supported version does. Also, rather than relying
on get_config_vars('SO'), just try all the possibilities for shlib
extensions. Also, rather than trusting Py_ENABLE_SHARED, believe we've
found a shlib only if it has a recognized extension. Last, explicitly
cope with the possibility that the shlib is really in /usr/lib and
$python_libdir is a red herring --- this is the actual situation on older
macOS, but we were only accidentally working with it.
Robert Haas [Tue, 4 Oct 2016 15:49:09 +0000 (11:49 -0400)]
Remove trailing commas from enums.
Buildfarm member mylodon doesn't like them. Actually, I don't like
them either, but I failed to notice these before pushing commit 6f3bd98ebfc008cbd676da777bb0b2376c4c4bfa.
Robert Haas [Tue, 4 Oct 2016 14:50:13 +0000 (10:50 -0400)]
Extend framework from commit 53be0b1ad to report latch waits.
WaitLatch, WaitLatchOrSocket, and WaitEventSetWait now taken an
additional wait_event_info parameter; legal values are defined in
pgstat.h. This makes it possible to uniquely identify every point in
the core code where we are waiting for a latch; extensions can pass
WAIT_EXTENSION.
Because latches were the major wait primitive not previously covered
by this patch, it is now possible to see information in
pg_stat_activity on a large number of important wait events not
previously addressed, such as ClientRead, ClientWrite, and SyncRep.
Unfortunately, many of the wait events added by this patch will fail
to appear in pg_stat_activity because they're only used in background
processes which don't currently appear in pg_stat_activity. We should
fix this either by creating a separate view for such information, or
else by deciding to include them in pg_stat_activity after all.
Michael Paquier and Robert Haas, reviewed by Alexander Korotkov and
Thomas Munro.
Tom Lane [Tue, 4 Oct 2016 13:38:43 +0000 (09:38 -0400)]
Fix hstore_plpython for Python 3.
In commit d51924be8, I overlooked the need to provide linkage for
PLyUnicode_FromStringAndSize, because that's only used (and indeed
only exists) in Python 3 builds.
In light of the need to #if this item, rearrange the ordering of
the code related to each function pointer, so as not to need more
#if's than absolutely necessary.
Andres Freund [Tue, 4 Oct 2016 05:11:36 +0000 (22:11 -0700)]
Correct logical decoding restore behaviour for subtransactions.
Before initializing iteration over a subtransaction's changes, the last
few changes were not spilled to disk. That's correct if the transaction
didn't spill to disk, but otherwise... This bug can lead to missed or
misorderd subtransaction contents when they were spilled to disk.
Move spilling of the remaining in-memory changes to
ReorderBufferIterTXNInit(), where it can easily be applied to the top
transaction and, if present, subtransactions.
Since this code had too many bugs already, noticeably increase test
coverage.
Fixes: #14319 Reported-By: Huan Ruan
Discussion: <20160909012610.20024.58169@wrigleys.postgresql.org>
Backport: 9,4-, where logical decoding was added
Tom Lane [Tue, 4 Oct 2016 02:27:11 +0000 (22:27 -0400)]
Convert contrib/hstore_plpython to not use direct linking to other modules.
Previously, on most platforms, we allowed hstore_plpython's references
to hstore and plpython to be unresolved symbols at link time, trusting
the dynamic linker to resolve them when the module is loaded. This
has a number of problems, the worst being that the dynamic linker
does not know where the references come from and can do nothing but
fail if those other modules haven't been loaded. We've more or less
gotten away with that for the limited use-case of datatype transform
modules, but even there, it requires some awkward hacks, most recently
commit 83c249200.
Instead, let's not treat these references as linker-resolvable at all,
but use function pointers that are manually filled in by the module's
_PG_init function. There are few enough contact points that this
doesn't seem unmaintainable, at least for these use-cases. (Note that
the same technique wouldn't work at all for decoupling from libpython
itself, but fortunately that's just a standard shared library and can
be linked to normally.)
This is an initial patch that just converts hstore_plpython. If the
buildfarm doesn't find any fatal problems, I'll work on the other
transform modules soon.
Tom Lane [Mon, 3 Oct 2016 20:40:05 +0000 (16:40 -0400)]
Show a sensible value in pg_settings.unit for GUC_UNIT_XSEGS variables.
Commit 88e982302 invented GUC_UNIT_XSEGS for min_wal_size and max_wal_size,
but neglected to make it display sensibly in pg_settings.unit (by adding a
case to the switch in GetConfigOptionByNum). Fix that, and adjust said
switch to throw a run-time error the next time somebody forgets.
In passing, avoid using a static buffer for the output string --- the rest
of this function pstrdup's from a local buffer, and I see no very good
reason why the units code should do it differently and less safely.
Per report from Otar Shavadze. Back-patch to 9.5 where the new unit type
was added.
Stephen Frost [Mon, 3 Oct 2016 20:22:57 +0000 (16:22 -0400)]
Fix RLS with COPY (col1, col2) FROM tab
Attempting to COPY a subset of columns from a table with RLS enabled
would fail due to an invalid query being constructed (using a single
ColumnRef with the list of fields to exact in 'fields', but that's for
the different levels of an indirection for a single column, not for
specifying multiple columns).
Correct by building a ColumnRef and then RestTarget for each column
being requested and then adding those to the targetList for the select
query. Include regression tests to hopefully catch if this is broken
again in the future.
Patch-By: Adam Brightwell Reviewed-By: Michael Paquier
Tom Lane [Mon, 3 Oct 2016 14:07:39 +0000 (10:07 -0400)]
Enforce a specific order for probing library loadability in pg_upgrade.
pg_upgrade checks whether all the shared libraries used in the old cluster
are also available in the new one by issuing LOAD for each library name.
Previously, it cared not what order it did the LOADs in. Ideally it
should not have to care, but currently the transform modules in contrib
fail unless both the language and datatype modules they depend on are
loaded first. A backend-side solution for that looks possible but
probably not back-patchable, so as a stopgap measure, let's do the LOAD
tests in order by library name length. That should fix the problem for
reasonably-named transform modules, eg "hstore_plpython" will be loaded
after both "hstore" and "plpython". (Yeah, it's a hack.)
In a larger sense, having a predictable order of these probes is a good
thing, since it will make upgrades predictably work or not work in the
face of inter-library dependencies. Also, this patch replaces O(N^2)
de-duplication logic with O(N log N) logic, which could matter in
installations with very many databases. So I don't foresee reverting this
even after we have a proper fix for the library-dependency problem.
In passing, improve a couple of SQL queries used here.
Per complaint from Andrew Dunstan that pg_upgrade'ing the transform contrib
modules failed. Back-patch to 9.5 where transform modules were introduced.
Change the way pre-reading in external sort's merge phase works.
Don't pre-read tuples into SortTuple slots during merge. Instead, use the
memory for larger read buffers in logtape.c. We're doing the same number
of READTUP() calls either way, but managing the pre-read SortTuple slots
is much more complicated. Also, the on-tape representation is more compact
than SortTuples, so we can fit more pre-read tuples into the same amount
of memory this way. And we have better cache-locality, when we use just a
small number of SortTuple slots.
Now that we only hold one tuple from each tape in the SortTuple slots, we
can greatly simplify the "batch memory" management. We now maintain a
small set of fixed-sized slots, to hold the tuples, and fall back to
palloc() for larger tuples. We use this method during all merge phases,
not just the final merge, and also when randomAccess is requested, and
also in the TSS_SORTEDONTAPE case. In other words, it's used whenever we
do an external sort.
Tom Lane [Sun, 2 Oct 2016 18:31:28 +0000 (14:31 -0400)]
Add ALTER EXTENSION ADD/DROP ACCESS METHOD, and use it in pg_upgrade.
Without this, an extension containing an access method is not properly
dumped/restored during pg_upgrade --- the AM ends up not being a member
of the extension after upgrading.
Another oversight in commit 473b93287, reported by Andrew Dunstan.
Tom Lane [Sun, 2 Oct 2016 16:33:46 +0000 (12:33 -0400)]
Avoid leaking FDs after an fsync failure.
Fixes errors introduced in commit bc34223bc, as detected by Coverity.
In passing, report ENOSPC for a short write while padding a new wal file in
open_walfile, make certain that close_walfile closes walfile in all cases,
and improve a couple of comments.
Tom Lane [Sat, 1 Oct 2016 21:15:09 +0000 (17:15 -0400)]
Do ClosePostmasterPorts() earlier in SubPostmasterMain().
In standard Unix builds, postmaster child processes do ClosePostmasterPorts
immediately after InitPostmasterChild, that is almost immediately after
being spawned. This is important because we don't want children holding
open the postmaster's end of the postmaster death watch pipe.
However, in EXEC_BACKEND builds, SubPostmasterMain was postponing this
responsibility significantly, in order to make it slightly more convenient
to pass the right flag value to ClosePostmasterPorts. This is bad,
particularly seeing that process_shared_preload_libraries() might invoke
nearly-arbitrary code. Rearrange so that we do it as soon as we've
fetched the socket FDs via read_backend_variables().
Also move the comment explaining about randomize_va_space to before the
call of PGSharedMemoryReAttach, which is where it's relevant. The old
placement was appropriate when the reattach happened inside
CreateSharedMemoryAndSemaphores, but that was a long time ago.
Back-patch to 9.3; the patch doesn't apply cleanly before that, and
it doesn't seem worth a lot of effort given that we've had no actual
field complaints traceable to this.
Tom Lane [Sat, 1 Oct 2016 20:32:54 +0000 (16:32 -0400)]
Fix bugs in contrib/pg_visibility.
collect_corrupt_items() failed to initialize tuple.t_self. While
HeapTupleSatisfiesVacuum() doesn't actually use that value, it does
Assert that it's valid, so that the code would dump core if ip_posid
chanced to be zero. (That's somewhat unlikely, which probably explains
how this got missed. In any case it wouldn't matter for field use.)
Also, collect_corrupt_items was returning the wrong TIDs, that is the
contents of t_ctid rather than the tuple's own location. This would
be the same thing in simple cases, but it could be wrong if, for
example, a past update attempt had been rolled back, leaving a live
tuple whose t_ctid doesn't point at itself.
Also, in pg_visibility(), guard against trying to read a page past
the end of the rel. The VM code handles inquiries beyond the end
of the map by silently returning zeroes, and it seems like we should
do the same thing here.
I ran into the assertion failure while using pg_visibility to check
pg_upgrade's behavior, and then noted the other problems while
reading the code.
Tom Lane [Sat, 1 Oct 2016 17:45:16 +0000 (13:45 -0400)]
Fix misstatement in comment in Makefile.shlib.
There is no need for "all: all-lib" to be placed before inclusion of
Makefile.shlib. Makefile.global is what ensures that "all" is the
default target, and we already document that that has to be included
first. Per comment from Pavel Raiskup.
Tom Lane [Sat, 1 Oct 2016 17:35:13 +0000 (13:35 -0400)]
Fix misplacement of submake-generated-headers prerequisites.
The sequence "configure; cd src/pl/plpython; make -j" failed due to
trying to compile plpython's .o files before the generated headers
finished building. (This is an important real-world case, since it's
the typical second step when building both plpython2 and plpython3.)
This happens because the submake-generated-headers target is not
placed in a way to make it a prerequisite to compiling the .o files.
Fix that.
Checking other uses of submake-generated-headers, I noted that the one
attached to pg_regress was similarly misplaced; but it's actually not
needed at all for pg_regress.o, rather regress.o, so move it to be a
prerequisite of that.
Back-patch to 9.6 where submake-generated-headers was introduced
(by commit 548af97fc). It's not immediately clear to me why the
previous coding didn't have the same issue; but since we've not
had field reports of plpython make failing, leave it alone in the
older branches.
Peter Eisentraut [Fri, 30 Sep 2016 16:00:00 +0000 (12:00 -0400)]
Set log_line_prefix and application name in test drivers
Before pg_regress runs psql, set the application name to the test name.
Similarly, set the application name to the test file name in the TAP
tests. Also, set a default log_line_prefix that show the application
name, as well as the PID and a time stamp.
That way, the server log output can be correlated to the test input
files, making debugging a bit easier.
Tom Lane [Sat, 1 Oct 2016 00:40:27 +0000 (20:40 -0400)]
Improve error reporting in pg_upgrade's file copying/linking/rewriting.
The previous design for this had copyFile(), linkFile(), and
rewriteVisibilityMap() returning strerror strings, with the caller
producing one-size-fits-all error messages based on that. This made it
impossible to produce messages that described the failures with any degree
of precision, especially not short-read problems since those don't set
errno at all.
Since pg_upgrade has no intention of continuing after any error in this
area, let's fix this by just letting these functions call pg_fatal() for
themselves, making it easy for each point of failure to have a suitable
error message. Taking this approach also allows dropping cleanup code
that was unnecessary and was often rather sloppy about preserving errno.
To not lose relevant info that was reported before, pass in the schema name
and table name of the current table so that they can be included in the
error reports.
An additional problem was the use of getErrorText(), which was flat out
wrong for all but a couple of call sites, because it unconditionally did
"_dosmaperr(GetLastError())" on Windows. That's only appropriate when
reporting an error from a Windows-native API, which only a couple of
the callers were actually doing. Thus, even the reported strerror string
would be unrelated to the actual failure in many cases on Windows.
To fix, get rid of getErrorText() altogether, and just have call sites
do strerror(errno) instead, since that's the way all the rest of our
frontend programs do it. Add back the _dosmaperr() calls in the two
places where that's actually appropriate.
In passing, make assorted messages hew more closely to project style
guidelines, notably by removing initial capitals in not-complete-sentence
primary error messages. (I didn't make any effort to clean up places
I didn't have another reason to touch, though.)
Per discussion of a report from Thomas Kellerer. Back-patch to 9.6,
but no further; given the relative infrequency of reports of problems
here, it's not clear it's worth adapting the patch to older branches.
Patch by me, but with credit to Alvaro Herrera for spotting the issue
with getErrorText's misuse of _dosmaperr().
Tom Lane [Sat, 1 Oct 2016 00:39:00 +0000 (20:39 -0400)]
Fix multiple portability issues in pg_upgrade's rewriteVisibilityMap().
This is new code in 9.6, and evidently we missed out testing it as
thoroughly as it should have been. Bugs fixed here:
1. Use binary not text mode to open the files on Windows. Before, if
the visibility map chanced to contain two bytes that looked like \r\n,
Windows' read() would convert that to \n, which both corrupts the map
data and causes the file to look shorter than it should. Unless you
were *very* unlucky and had an exact multiple of 8K such occurrences
in each VM file, this would cause pg_upgrade to report a failure,
though with a rather obscure error message.
2. The code for copying rebuilt bytes into the output was simply wrong.
It chanced to work okay on little-endian machines but would emit the
bytes in the wrong order on big-endian, leading to silent corruption
of the visibility map data.
3. The code was careless about alignment of the working buffers. Given
all three of an alignment-picky architecture, a compiler that chooses
to put the new_vmbuf[] local variable at an odd starting address, and
a checksum-enabled database, pg_upgrade would dump core.
Point one was reported by Thomas Kellerer, the other two detected by
code-reading.
Point two is much the nastiest of these issues from an impact standpoint,
though fortunately it affects only a minority of users. The Windows issue
will definitely bite people, but it seems quite unlikely that there would
be undetected corruption from that.
In addition, I failed to resist the temptation to do some minor cosmetic
adjustments, mostly improving the comments.
It would be a good idea to try to improve the error reporting here, but
that seems like material for a separate patch.
Magnus Hagander [Fri, 30 Sep 2016 09:19:30 +0000 (11:19 +0200)]
Retry opening new segments in pg_xlogdump --folllow
There is a small window between when the server closes out the existing
segment and the new one is created. Put a loop around the open call in
this case to make sure we wait for the new file to actually appear.
Stephen Frost [Fri, 30 Sep 2016 02:13:38 +0000 (22:13 -0400)]
Remove superuser checks in pgstattuple
Now that we track initial privileges on extension objects and changes to
those permissions, we can drop the superuser() checks from the various
functions which are part of the pgstattuple extension and rely on the
GRANT system to control access to those functions.
Since a pg_upgrade will preserve the version of the extension which
existed prior to the upgrade, we can't simply modify the existing
functions but instead need to create new functions which remove the
checks and update the SQL-level functions to use the new functions
(and to REVOKE EXECUTE rights on those functions from PUBLIC).
Thanks to Tom and Andres for adding support for extensions to follow
update paths (see: 40b449a), allowing this patch to be much smaller
since no new base version script needed to be included.
Tom Lane [Thu, 29 Sep 2016 17:32:27 +0000 (13:32 -0400)]
Allow contrib/file_fdw to read from a program, like COPY FROM PROGRAM.
This patch just exposes COPY's FROM PROGRAM option in contrib/file_fdw.
There don't seem to be any security issues with that that are any worse
than what already exist with file_fdw and COPY; as in the existing cases,
only superusers are allowed to control what gets executed.
A regression test case might be nice here, but choosing a 100% portable
command to run is hard. (We haven't got a test for COPY FROM PROGRAM
itself, either.)
Corey Huinker and Adam Gomaa, reviewed by Amit Langote
Don't bother to lock bufmgr partitions in pg_buffercache.
That makes the view a lot less disruptive to use on a production system.
Without the locks, you don't get a consistent snapshot across all buffers,
but that's OK. It wasn't a very useful guarantee in practice.
Ivan Kartyshov, reviewed by Tomas Vondra and Robert Haas.
Peter Eisentraut [Wed, 28 Sep 2016 16:00:00 +0000 (12:00 -0400)]
Exclude additional directories in pg_basebackup
The list of files and directories that pg_basebackup excludes from the
backup was somewhat incomplete and unorganized. Change that with having
the exclusion driven from tables. Clean up some code around it. Also
document the exclusions in more detail so that users of pg_start_backup
can make use of it as well.
The contents of these directories are now excluded from the backup:
pg_dynshmem, pg_notify, pg_serial, pg_snapshots, pg_subtrans
Also fix a bug that a pg_repl_slot or pg_stat_tmp being a symlink would
cause a corrupt tar header to be created. Now such symlinks are
included in the backup as empty directories. Bug found by Ashutosh
Sharma <ashu.coek88@gmail.com>.
From: David Steele <david@pgmasters.net> Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
Tom Lane [Wed, 28 Sep 2016 21:08:40 +0000 (17:08 -0400)]
Rationalize format-picture caching logic in formatting.c.
Add a validity flag to DCHCacheEntry and NUMCacheEntry entries, and
do not set it true until after we've parsed the supplied format string.
This allows dealing with possible errors while parsing the format
without the baroque hack that was there before (which only covered
errors within NUMDesc_prepare, anyway). We can get rid of the PG_TRY in
NUMDesc_prepare, as well as last_NUMCacheEntry and NUM_cache_remove.
(Essentially, this reverts commit ff783fbae in favor of a less fragile
solution; the problems with that approach are well illustrated by later
hacking such as 55f927a46.)
In passing, define the size of these caches as DCH_CACHE_ENTRIES not
DCH_CACHE_FIELDS + 1 (whoever thought that was a good definition?)
and likewise for the NUM cache. Also const-ify format string parameters
where convenient, and merge duplicated cache lookup logic.
This is primarily driven by a proposed patch from Artur Zakirov,
which introduced some ereport's into format string parsing for
the datetime case. He proposed preventing the creation of invalid
cache entries by parsing the format string first into a local-variable
array, and then copying that to a cache entry. That seemed a bit
ugly to me, and anyway randomly different from the way the identical
problem had been solved for the numeric case. Let's make the two
sets of code more similar not less so.
I'm not sure whether we'll adopt the new error conditions Artur proposes,
but this patch seems like good code cleanup and future-proofing in any
case. The existing code is critically (and undocumented-ly) dependent on
no elog being thrown out of several nontrivial functions, which is trouble
waiting to happen, though it doesn't seem to be actively broken today.
Tom Lane [Wed, 28 Sep 2016 18:36:04 +0000 (14:36 -0400)]
Make to_timestamp() and to_date() range-check fields of their input.
Historically, something like to_date('2009-06-40','YYYY-MM-DD') would
return '2009-07-10' because there was no prohibition on out-of-range
month or day numbers. This has been widely panned, and it also turns
out that Oracle throws an error in such cases. Since these functions
are nominally Oracle-compatibility features, let's change that.
There's no particular restriction on year (modulo the fact that the
scanner may not believe that more than 4 digits are year digits,
a matter to be addressed separately if at all). But we now check month,
day, hour, minute, second, and fractional-second fields, as well as
day-of-year and second-of-day fields if those are used.
Currently, no checks are made on ISO-8601-style week numbers or day
numbers; it's not very clear what the appropriate rules would be there,
and they're probably so little used that it's not worth sweating over.
Artur Zakirov, reviewed by Amul Sul, further adjustments by me
Peter Eisentraut [Wed, 28 Sep 2016 16:00:00 +0000 (12:00 -0400)]
Fix CRC check handling in get_controlfile
The previous patch broke this by returning NULL for a failed CRC check,
which pg_controldata would then try to read. Fix by returning the
result of the CRC check in a separate argument.
Robert Haas [Wed, 28 Sep 2016 15:19:46 +0000 (11:19 -0400)]
Fix dangling pointer problem in ReorderBufferSerializeChange.
Commit 3fe3511d05127cc024b221040db2eeb352e7d716 introduced a new
case into this function, but neglected to ensure that the "ondisk"
pointer got updated after a possible reallocation as the code does
in other cases.
Stas Kelvich, per diagnosis by Konstantin Knizhnik.
This makes the parameter easier to extend, to support other password-based
authentication protocols than MD5. (SCRAM is being worked on.)
The GUC still accepts on/off as aliases for "md5" and "plain", although
we may want to remove those once we actually add support for another
password hash type.
Michael Paquier, reviewed by David Steele, with some further edits by me.
Tom Lane [Tue, 27 Sep 2016 22:43:36 +0000 (18:43 -0400)]
Disallow pushing volatile quals past set-returning functions.
Pushing an upper-level restriction clause into an unflattened
subquery-in-FROM is okay when the subquery contains no SRFs in its
targetlist, or when it does but the SRFs are unreferenced by the clause
*and the clause is not volatile*. Otherwise, we're changing the number
of times the clause is evaluated, which is bad for volatile quals, and
possibly changing the result, since a volatile qual might succeed for some
SRF output rows and not others despite not referencing any of the changing
columns. (Indeed, if the clause is something like "random() > 0.5", the
user is probably expecting exactly that behavior.)
We had most of these restrictions down, but not the one about the upper
clause not being volatile. Fix that, and add a regression test to
illustrate the expected behavior.
Although this is definitely a bug, it doesn't seem like back-patch
material, since possibly some users don't realize that the broken
behavior is broken and are relying on what happens now. Also, while
the added test is quite cheap in the wake of commit a4c35ea1c, it would
be much more expensive (or else messier) in older branches.
Tom Lane [Tue, 27 Sep 2016 18:29:12 +0000 (14:29 -0400)]
Make struct ParallelSlot private within pg_dump/parallel.c.
The only field of this struct that other files have any need to touch
is the pointer to the TocEntry a worker is working on. (Well,
pg_backup_archiver.c is actually looking at workerStatus too, but that
can be finessed by specifying that the TocEntry pointer is NULL for a
non-busy worker.)
Hence, move out the TocEntry pointers to a separate array within
struct ParallelState, and then we can make struct ParallelSlot private.
I noted the possibility of this previously, but hadn't got round to
actually doing it.
Tom Lane [Tue, 27 Sep 2016 17:56:04 +0000 (13:56 -0400)]
Rationalize parallel dump/restore's handling of worker cmd/status messages.
The existing APIs for creating and parsing command and status messages are
rather messy; for example, archive-format modules have to provide code
for constructing command messages, which is entirely pointless since
the code to read them is hard-wired in WaitForCommands() and hence
no format-specific variation is actually possible. But there's little
foreseeable reason to need format-specific variation anyway.
The situation for status messages is no better; at least those are both
constructed and parsed by format-specific code, but said code is quite
redundant since there's no actual need for format-specific variation.
To add insult to injury, the first API involves returning pointers to
static buffers, which is bad, while the second involves returning pointers
to malloc'd strings, which is safer but randomly inconsistent.
Hence, get rid of the MasterStartParallelItem and MasterEndParallelItem
APIs, and instead write centralized functions that construct and parse
command and status messages. If we ever do need more flexibility, these
functions can be the standard implementations of format-specific
callback methods, but that's a long way off if it ever happens.
The ListenToWorkers/ReapWorkerStatus APIs were messy and hard to use.
Instead, make DispatchJobForTocEntry register a callback function that
will take care of state cleanup, doing whatever had been done by the caller
of ReapWorkerStatus in the old design. (This callback is essentially just
the old mark_work_done function in the restore case, and a trivial test for
worker failure in the dump case.) Then we can have ListenToWorkers call
the callback immediately on receipt of a status message, and return the
worker to WRKR_IDLE state; so the WRKR_FINISHED state goes away.
This allows us to design a unified wait-for-worker-messages loop:
WaitForWorkers replaces EnsureIdleWorker and EnsureWorkersFinished as well
as the mess in restore_toc_entries_parallel. Also, we no longer need the
fragile API spec that the caller of DispatchJobForTocEntry is responsible
for ensuring there's an idle worker, since DispatchJobForTocEntry can just
wait until there is one.
In passing, I got rid of the ParallelArgs struct, which was a net negative
in terms of notational verboseness, and didn't seem to be providing any
noticeable amount of abstraction either.
Tom Lane [Tue, 27 Sep 2016 15:38:33 +0000 (11:38 -0400)]
Improve contrib/cube's handling of zero-D cubes, infinities, and NaNs.
It's always been possible to create a zero-dimensional cube by converting
from a zero-length float8 array, but cube_in failed to accept the '()'
representation that cube_out produced for that case, resulting in a
dump/reload hazard. Make it accept the case. Also fix a couple of
other places that didn't behave sanely for zero-dimensional cubes:
cube_size would produce 1.0 when surely the answer should be 0.0,
and g_cube_distance risked a divide-by-zero failure.
Likewise, it's always been possible to create cubes containing float8
infinity or NaN coordinate values, but cube_in couldn't parse such input,
and cube_out produced platform-dependent spellings of the values. Convert
them to use float8in_internal and float8out_internal so that the behavior
will be the same as for float8, as we recently did for the core geometric
types (cf commit 50861cd68). As in that commit, I don't pretend that this
patch fixes all insane corner-case behaviors that may exist for NaNs, but
it's a step forward.
(This change allows removal of the separate cube_1.out and cube_3.out
expected-files, as the platform dependency that previously required them
is now gone: an underflowing coordinate value will now produce an error
not plus or minus zero.)
Make errors from cube_in follow project conventions as to spelling
("invalid input syntax for cube" not "bad cube representation")
and errcode (INVALID_TEXT_REPRESENTATION not SYNTAX_ERROR).
Also a few marginal code cleanups and comment improvements.
<sys/select.h> is required by POSIX.1-2001 to get the prototype of
select(2), but nearly no systems enforce that because older standards
let you get away with including some other headers. Recent OpenBSD
hacking has removed that frail touch of friendliness, however, which
broke some compiles; fix all the way back to 9.1 by adding the required
standard. Only vacuumdb.c was reported to fail, but it seems easier to
fix the whole lot in a fell swoop.
Tom Lane [Tue, 27 Sep 2016 00:23:50 +0000 (20:23 -0400)]
Fix newly-introduced issues in pgbench.
The result of FD_ISSET() doesn't necessarily fit in a bool, though
assigning it to one might accidentally work depending on platform and which
socket FD number is being inquired of. Rewrite to test it with if(),
rather than making any specific assumption about the result width,
to match the way every other such call in PG is written.
Don't break out of the input_mask-filling loop after finding the first
client that we're waiting for results from. That mostly breaks parallel
query management.
Also, if we choose not to call select(), be sure to clear out any bits
the mask-filling loop might have set, so that we don't accidentally call
doCustom for clients we don't know have input. Doing so would likely
be harmless, but it's a waste of cycles and doesn't seem to be intended.
Make this_usec wide enough. (Yeah, the value would usually fit in an
int, but then why are we using int64 everywhere else?)
Minor cosmetic adjustments, mostly comment improvements.
Problems introduced by commit 12788ae49. The first issue was discovered
by buildfarm testing, the others by code review.
Tom Lane [Mon, 26 Sep 2016 18:52:44 +0000 (14:52 -0400)]
Replace the built-in GIN array opclasses with a single polymorphic opclass.
We had thirty different GIN array opclasses sharing the same operators and
support functions. That still didn't cover all the built-in types, nor
did it cover arrays of extension-added types. What we want is a single
polymorphic opclass for "anyarray". There were two missing features needed
to make this possible:
1. We have to be able to declare the index storage type as ANYELEMENT
when the opclass is declared to index ANYARRAY. This just takes a few
more lines in index_create(). Although this currently seems of use only
for GIN, there's no reason to make index_create() restrict it to that.
2. We have to be able to identify the proper GIN compare function for
the index storage type. This patch proceeds by making the compare function
optional in GIN opclass definitions, and specifying that the default btree
comparison function for the index storage type will be looked up when the
opclass omits it. Again, that seems pretty generically useful.
Since the comparison function lookup is done in initGinState(), making
use of the second feature adds an additional cache lookup to GIN index
access setup. It seems unlikely that that would be very noticeable given
the other costs involved, but maybe at some point we should consider
making GinState data persist longer than it now does --- we could keep it
in the index relcache entry, perhaps.
Rather fortuitously, we don't seem to need to do anything to get this
change to play nice with dump/reload or pg_upgrade scenarios: the new
opclass definition is automatically selected to replace existing index
definitions, and the on-disk data remains compatible. Also, if a user has
created a custom opclass definition for a non-builtin type, this doesn't
break that, since CREATE INDEX will prefer an exact match to opcintype
over a match to ANYARRAY. However, if there's anyone out there with
handwritten DDL that explicitly specifies _bool_ops or one of the other
replaced opclass names, they'll need to adjust that.
Refactor script execution state machine in pgbench.
The doCustom() function had grown into quite a mess. Rewrite it, in a more
explicit state machine style, for readability.
This also fixes one minor bug: if a script consisted entirely of meta
commands, doCustom() never returned to the caller, so progress reports
with the -P option were not printed. I don't want to backpatch this
refactoring, and the bug is quite insignificant, so only commit this to
master, and leave the bug unfixed in back-branches.
Tom Lane [Sun, 25 Sep 2016 19:40:57 +0000 (15:40 -0400)]
Refer to OS X as "macOS", except for the port name which is still "darwin".
We weren't terribly consistent about whether to call Apple's OS "OS X"
or "Mac OS X", and the former is probably confusing to people who aren't
Apple users. Now that Apple has rebranded it "macOS", follow their lead
to establish a consistent naming pattern. Also, avoid the use of the
ancient project name "Darwin", except as the port code name which does not
seem desirable to change. (In short, this patch touches documentation and
comments, but no actual code.)
I didn't touch contrib/start-scripts/osx/, either. I suspect those are
obsolete and due for a rewrite, anyway.
I dithered about whether to apply this edit to old release notes, but
those were responsible for quite a lot of the inconsistencies, so I ended
up changing them too. Anyway, Apple's being ahistorical about this,
so why shouldn't we be?
Tom Lane [Fri, 23 Sep 2016 19:50:00 +0000 (15:50 -0400)]
Install TAP test infrastructure so it's available for extension testing.
When configured with --enable-tap-tests, "make install" will now install
the Perl support files for TAP testing where PGXS will find them.
This allows extensions to rely on $(prove_check) even when being built
out-of-tree. Back-patch to 9.4 where we first started to support TAP
testing, to reduce the number of cases extension makefiles need to
consider.
Tom Lane [Fri, 23 Sep 2016 18:22:07 +0000 (14:22 -0400)]
Doc: fix examples of # operators so they actually work.
These worked as-is until around 7.0, but fail in newer versions because
there are more operators named "#". Besides it's a bit inconsistent that
only two of the examples on this page lack type names on their constants.
Tom Lane [Fri, 23 Sep 2016 17:49:26 +0000 (13:49 -0400)]
Fix incorrect logic for excluding range constructor functions in pg_dump.
Faulty AND/OR nesting in the WHERE clause of getFuncs' SQL query led to
dumping range constructor functions if they are part of an extension
and we're in binary-upgrade mode. Actually, we don't want to dump them
separately even then, since CREATE TYPE AS RANGE will create the range's
constructor functions regardless. Per report from Andrew Dunstan.
It looks like this mistake was introduced by me, in commit b985d4877, in
perhaps-overzealous refactoring to reduce code duplication. I'm suitably
embarrassed.
Tom Lane [Fri, 23 Sep 2016 14:09:52 +0000 (10:09 -0400)]
Don't trust CreateFileMapping() to clear the error code on success.
We must test GetLastError() even when CreateFileMapping() returns a
non-null handle. If that value were left over from some previous system
call, we might be fooled into thinking the segment already existed.
Experimentation on Windows 7 suggests that CreateFileMapping() clears
the error code on success, but it is not documented to do so, so let's
not rely on that happening in all Windows releases.
Tom Lane [Fri, 23 Sep 2016 13:54:11 +0000 (09:54 -0400)]
Avoid using PostmasterRandom() for DSM control segment ID.
Commits 470d886c3 et al intended to fix the problem that the postmaster
selected the same "random" DSM control segment ID on every start. But
using PostmasterRandom() for that destroys the intended property that the
delay between random_start_time and random_stop_time will be unpredictable.
(Said delay is probably already more predictable than we could wish, but
that doesn't mean that reducing it by a couple orders of magnitude is OK.)
Revert the previous patch and add a comment warning against misuse of
PostmasterRandom. Fix the original problem by calling srandom() early in
PostmasterMain, using a low-security seed that will later be overwritten
by PostmasterRandom.