Tom Lane [Sun, 26 Mar 2017 21:35:35 +0000 (17:35 -0400)]
Fix unportable disregard of alignment requirements in RADIUS code.
The compiler is entitled to store a char[] local variable with no
particular alignment requirement. Our RADIUS code cavalierly took such
a local variable and cast its address to a struct type that does have
alignment requirements. On an alignment-picky machine this would lead
to bus errors. To fix, declare the local variable honestly, and then
cast its address to char * for use in the I/O calls.
Given the lack of field complaints, there must be very few if any
people affected; but nonetheless this is a clear portability issue,
so back-patch to all supported branches.
Noted while looking at a Coverity complaint in the same code.
Robert Haas [Fri, 24 Mar 2017 16:30:39 +0000 (12:30 -0400)]
plpgsql: Don't generate parallel plans for RETURN QUERY.
Commit 7aea8e4f2daa4b39ca9d1309a0c4aadb0f7ed81b allowed a parallel
plan to be generated when for a RETURN QUERY or RETURN QUERY EXECUTE
statement in a PL/pgsql block, but that's a bad idea because plplgsql
asks the executor for 50 rows at a time. That means that we'll always
be running serially a plan that was intended for parallel execution,
which is not a good idea. Fix by not requesting a parallel plan from
the outset.
Per discussion, back-patch to 9.6. There is a slight risk that, due
to optimizer error, somebody could have a case where the parallel plan
executed serially is actually faster than the supposedly-best serial
plan, but the consensus seems to be that that's not sufficient
justification for leaving 9.6 unpatched.
Teodor Sigaev [Fri, 24 Mar 2017 16:23:13 +0000 (19:23 +0300)]
Fix pgbench options -C and -R together
The bug is that prior to --rate doCustom was always disconnect/reconnect
without exiting, but with rate it returns if it has to wait. However threadRun
test whether there is a connection before recalling doCustom, so it was never
called.
Teodor Sigaev [Fri, 24 Mar 2017 10:55:02 +0000 (13:55 +0300)]
Fix backup canceling
Assert-enabled build crashes but without asserts it works by wrong way:
it may not reset forcing full page write and preventing from starting
exclusive backup with the same name as cancelled.
Patch replaces pair of booleans
nonexclusive_backup_running/exclusive_backup_running to single enum to
correctly describe backup state.
Backpatch to 9.6 where bug was introduced
Reported-by: David Steele
Authors: Michael Paquier, David Steele Reviewed-by: Anastasia Lubennikova
https://commitfest.postgresql.org/13/1068/
Revert Windows service check refactoring, and replace with a different fix.
This reverts commit 38bdba54a64bacec78e3266f0848b0b4a824132a, "Fix and
simplify check for whether we're running as Windows service". It turns out
that older versions of MinGW - like that on buildfarm member narwhal - do
not support the CheckTokenMembership() function. This replaces the
refactoring with a much smaller fix, to add a check for SE_GROUP_ENABLED to
pgwin32_is_service().
Only apply to back-branches, and keep the refactoring in HEAD. It's
unlikely that anyone is still really using such an old version of MinGW -
aside from narwhal - but let's not change the minimum requirements in
minor releases.
Teodor Sigaev [Tue, 21 Mar 2017 13:24:10 +0000 (16:24 +0300)]
Fix support for some operators (&<, &>, $<|, |&>) in box operator class
of SP-GiST.
Bug exists since initial commit of box opclass for SP-GiST,
so backpath to 9.6
Author: Nikita Glukhov with minor editorization of tests by me Reviewed-by: Kyotaro Horiguchi, Anastasia Lubennikova
https://commitfest.postgresql.org/13/981/
Teodor Sigaev [Mon, 20 Mar 2017 15:49:41 +0000 (18:49 +0300)]
Revert unintentional change in increasing usage count during pin of buffers,
this makes buffer access strategy have no effect.
Change was a part of commit 48354581a49c30f5757c203415aa8412d85b0f70 during 9.6
release cycle, so backpath to 9.6
Reported-by: Jim Nasby
Author: Alexander Korotkov Reviewed-by: Jim Nasby, Andres Freund
https://commitfest.postgresql.org/13/1029/
Tom Lane [Fri, 17 Mar 2017 18:58:06 +0000 (14:58 -0400)]
Fix WaitEventSetWait() to handle write-ready waits properly on Windows.
Windows apparently will not detect socket write-ready events unless a
preceding send attempt returned WSAEWOULDBLOCK. In many usage patterns
that's satisfied by the caller of WaitEvenSetWait(), but not always.
Apply the same solution that we already had in pgwin32_select(), namely to
perform a dummy WSASend() call with len=0. This will return WSAEWOULDBLOCK
if there's no buffer space (even though it could legitimately do nothing
and report success, which makes me a bit nervous about this solution;
but since it's been working fine in libpq, let's roll with it).
In passing, improve the comments about this in pgwin32_select(), and remove
duplicated code there.
Back-patch to 9.6 where WaitEventSetWait() was introduced. We might need
to back-patch something similar into predecessor code. But given the lack
of complaints so far, it's not clear that the case ever gets exercised
in the back branches, so I'm not going to expend effort on it right now.
This should resolve recurring failures on buildfarm member bowerbird,
which has been failing since 1e8a85009 went in.
Diagnosis and patch by Petr Jelinek, cosmetic adjustments by me.
Andrew Gierth [Fri, 17 Mar 2017 14:35:54 +0000 (14:35 +0000)]
Repair test for vacuum reltuples fix.
Concurrent auto-analyze could be holding a snapshot, affecting the
removal of deleted row versions. Remove the deletion to avoid this
happening. Per buildfarm.
In passing, make the test independent of assumptions of physical row
order, just out of sheer paranoia.
Fix and simplify check for whether we're running as Windows service.
If the process token contains SECURITY_SERVICE_RID, but it has been
disabled by the SE_GROUP_USE_FOR_DENY_ONLY attribute, win32_is_service()
would incorrectly report that we're running as a service. That situation
arises, e.g. if postmaster is launched with a restricted security token,
with the "Log in as Service" privilege explicitly removed.
Replace the broken code with CheckProcessTokenMembership(), which does
this correctly. Also replace similar code in win32_is_admin(), even
though it got this right, for simplicity and consistency.
Per bug #13755, reported by Breen Hagan. Back-patch to all supported
versions. Patch by Takayuki Tsunakawa, reviewed by Michael Paquier.
Andrew Gierth [Thu, 16 Mar 2017 22:31:49 +0000 (22:31 +0000)]
Avoid having vacuum set reltuples to 0 on non-empty relations in the
presence of page pins, which leads to serious estimation errors in the
planner. This particularly affects small heavily-accessed tables,
especially where locking (e.g. from FK constraints) forces frequent
vacuums for mxid cleanup.
Fix by keeping separate track of pages whose live tuples were actually
counted vs. pages that were only scanned for freezing purposes. Thus,
reltuples can only be set to 0 if all pages of the relation were
actually counted.
Backpatch to all supported versions.
Per bug #14057 from Nicolas Baccelli, analyzed by me.
Robert Haas [Tue, 14 Mar 2017 15:51:11 +0000 (11:51 -0400)]
Fix failure to mark init buffers as BM_PERMANENT.
This could result in corruption of the init fork of an unlogged index
if the ambuildempty routine for that index used shared buffers to
create the init fork, which was true for brin, gin, gist, and hash
indexes.
Patch by me, based on an earlier patch by Michael Paquier, who also
reviewed this one. This also incorporates an idea from Artur
Zakirov.
Tom Lane [Mon, 13 Mar 2017 20:46:32 +0000 (16:46 -0400)]
Remove unnecessary dependency on statement_timeout in prepared_xacts test.
Rather than waiting around for statement_timeout to expire, we can just
try to take the table's lock in nowait mode. This saves some fraction
under 4 seconds when running this test with prepared xacts available,
and it guards against timeout-expired-anyway failures on very slow
machines when prepared xacts are not available, as seen in a recent
failure on axolotl for instance.
This approach could fail if autovacuum were to take an exclusive lock
on the test table concurrently, but there's no reason for it to do so.
Since the main point here is to improve stability in the buildfarm,
back-patch to all supported branches.
Michael Meskes [Mon, 13 Mar 2017 19:44:13 +0000 (20:44 +0100)]
Ecpg should support COMMIT PREPARED and ROLLBACK PREPARED.
The problem was that "begin transaction" was issued automatically
before executing COMMIT/ROLLBACK PREPARED if not in auto commit. This fix by
Masahiko Sawada fixes this.
Noah Misch [Sun, 12 Mar 2017 23:35:31 +0000 (19:35 -0400)]
Fix pg_file_write() error handling.
Detect fclose() failures; given "ln -s /dev/full $PGDATA/devfull",
"pg_file_write('devfull', 'x', true)" now fails as it should. Don't
leak a stream when fwrite() fails. Remove a born-ineffective test that
aimed to skip zero-length writes. Back-patch to 9.2 (all supported
versions).
Joe Conway [Sat, 11 Mar 2017 21:32:26 +0000 (13:32 -0800)]
Fix ancient connection leak in dblink
When using unnamed connections with dblink, every time a new
connection is made, the old one is leaked. Fix that.
This has been an issue probably since dblink was first committed.
Someone complained almost ten years ago, but apparently I decided
not to pursue it at the time, and neither did anyone else, so it
slipped between the cracks. Now that someone else has complained,
fix in all supported branches.
Tom Lane [Fri, 10 Mar 2017 19:15:09 +0000 (14:15 -0500)]
Sanitize newlines in object names in "pg_restore -l" output.
Commits 89e0bac86 et al replaced newlines with spaces in object names
printed in SQL comments, but we neglected to consider that the same
names are also printed by "pg_restore -l", and a newline would render
the output unparseable by "pg_restore -L". Apply the same replacement
in "-l" output. Since "pg_restore -L" doesn't actually examine any
object names, only the dump ID field that starts each line, this is
enough to fix things for its purposes.
The previous fix was treated as a security issue, and we might have
done that here as well, except that the issue was reported publicly
to start with. Anyway it's hard to see how this could be exploited
for SQL injection; "pg_restore -L" doesn't do much with the file
except parse it for leading integers.
Per bug #14587 from Milos Urbanek. Back-patch to all supported versions.
Tom Lane [Thu, 9 Mar 2017 22:20:11 +0000 (17:20 -0500)]
Fix timestamptz regression test to still work with latest IANA zone data.
The IANA timezone crew continues to chip away at their project of removing
timezone abbreviations that have no real-world currency from their
database. The tzdata2017a update removes all such abbreviations for
South American zones, as well as much of the Pacific. This breaks some
test cases in timestamptz.sql that were expecting America/Santiago and
America/Caracas to have non-numeric abbreviations.
The test cases involving America/Santiago seem to have selected that
zone more or less at random, so just replace it with America/New_York,
which is of similar longitude. The cases involving America/Caracas are
harder since they were chosen to test a time-varying zone abbreviation
around a point where it changed meaning in the backwards direction.
Fortunately, Europe/Moscow has a similar case in 2014, and the MSK/MSD
abbreviations are well enough attested that IANA seems unlikely to
decide to remove them from the database in future.
With these changes, this regression test should pass when using any IANA
zone database from 2015 or later. One could wish that there were a few
years more daylight on how out-of-date your zone database can be ... but
really the --with-system-tzdata option is only meant for use on platforms
where the zone database is kept up-to-date pretty faithfully, so I do not
think this is a big objection.
Tom Lane [Wed, 8 Mar 2017 17:21:12 +0000 (12:21 -0500)]
Use doubly-linked block lists in aset.c to reduce large-chunk overhead.
Large chunks (those too large for any palloc freelist) are managed as
separate blocks. Formerly, realloc'ing or pfree'ing such a chunk required
O(N) time in a context with N blocks, since we had to traipse down the
singly-linked block list to locate the block's predecessor before we could
fix the list links. This can result in O(N^2) runtime in situations where
large numbers of such chunks are manipulated within one context. Cases
like that were not foreseen in the original design of aset.c, and indeed
didn't arise until fairly recently. But such problems can now occur in
reorderbuffer.c and in hash joining, both of which make repeated large
requests without scaling up their request size as they do so, and which
will free their requests in not-necessarily-LIFO order.
To fix, change the block list from singly-linked to doubly-linked.
This adds another 4 or 8 bytes to ALLOC_BLOCKHDRSZ, but that doesn't
seem like unacceptable overhead, since aset.c's blocks are normally
8K or more, and never less than 1K in current practice.
In passing, get rid of some redundant AllocChunkGetPointer() calls in
AllocSetRealloc (the compiler might be smart enough to optimize these
away anyway, but no need to assume that) and improve AllocSetCheck's
checking of block header fields.
Back-patch to 9.4 where reorderbuffer.c appeared. We could take this
further back, but currently there's no evidence that it would be useful.
Tom Lane [Tue, 7 Mar 2017 16:36:35 +0000 (11:36 -0500)]
Fix pgbench's failure to honor the documented long-form option "--builtin".
Not only did it not accept --builtin as a synonym for -b, but what it did
accept as a synonym was --tpc-b (huh?), which it got even further wrong
by marking as no_argument, so that if you did try that you got a core
dump. I suppose this is leftover from some early design for the new
switches added by commit 8bea3d221, but it's still pretty sloppy work.
Per bug #14580 from Stepan Pesternikov. Back-patch to 9.6 where the
error was introduced.
Stephen Frost [Tue, 7 Mar 2017 04:29:08 +0000 (23:29 -0500)]
pg_dump: Properly handle public schema ACLs with --clean
pg_dump has always handled the public schema in a special way when it
comes to the "--clean" option. To wit, we do not drop or recreate the
public schema in "normal" mode, but when we are run in "--clean" mode
then we do drop and recreate the public schema.
When running in "--clean" mode, the public schema is dropped and then
recreated and it is recreated with the normal schema-default privileges
of "nothing". This is unlike how the public schema starts life, which
is to have CREATE and USAGE GRANT'd to the PUBLIC role, and that is what
is recorded in pg_init_privs.
Due to this, in "--clean" mode, pg_dump would mistakenly only dump out
the set of privileges required to go from the initdb-time privileges on
the public schema to whatever the current-state privileges are. If the
privileges were not changed from initdb time, then no privileges would
be dumped out for the public schema, but with the schema being dropped
and recreated, the result was that the public schema would have no ACLs
on it instead of what it should have, which is the initdb-time
privileges.
Practically speaking, this meant that pg_dump with --clean mode dumping
a database where the ACLs on the public schema were not changed from the
default would, upon restore, result in a public schema with *no*
privileges GRANT'd, not matching the state of the existing database
(where the initdb-time privileges would have been CREATE and USAGE to
the PUBLIC role for the public schema).
To fix, adjust the query in getNamespaces() to ignore the pg_init_privs
entry for the public schema when running in "--clean" mode, meaning that
the privileges for the public schema would be dumped, correctly, as if
it was going from a newly-created schema to the current state (which is,
indeed, what will happen during the restore thanks to the DROP/CREATE).
Only the public schema is handled in this special way by pg_dump, no
other initdb-time objects are dropped/recreated in --clean mode.
Tom Lane [Tue, 7 Mar 2017 00:33:59 +0000 (19:33 -0500)]
Repair incorrect pg_dump labeling for some comments and security labels.
We attached no schema label to comments for procedural languages, casts,
transforms, operator classes, operator families, or text search objects.
The first three categories of objects don't really have schemas, but
pg_dump treats them as if they do, and it seems like the TocEntry fields
for their comments had better match the TocEntry fields for the parent
objects. (As an example of a possible hazard, the type names in a CAST
will be formatted with the assumption of a particular search_path, so
failing to ensure that this same path is active for the COMMENT ON command
could lead to an error or to attaching the comment to the wrong cast.)
In the last six cases, this was a flat-out error --- possibly mine to
begin with, but it was a long time ago.
The security label for a procedural language was likewise not correctly
labeled as to schema, and both the comment and security label for a
procedural language were not correctly labeled as to owner.
In simple cases the restore would accidentally work correctly anyway, since
these comments and security labels would normally get emitted right after
the owning object, and so the search path and active user would be correct
anyhow. But it could fail in corner cases; for example a schema-selective
restore would omit comments it should include.
Giuseppe Broccolo noted the oversight, and proposed the correct fix, for
text search dictionary objects; I found the rest by cross-checking other
dumpComment() calls. These oversights are ancient, so back-patch all
the way.
Stephen Frost [Mon, 6 Mar 2017 22:04:06 +0000 (17:04 -0500)]
pg_upgrade: Fix large object COMMENTS, SECURITY LABELS
When performing a pg_upgrade, we copy the files behind pg_largeobject
and pg_largeobject_metadata, allowing us to avoid having to dump out and
reload the actual data for large objects and their ACLs.
Unfortunately, that isn't all of the information which can be associated
with large objects. Currently, we also support COMMENTs and SECURITY
LABELs with large objects and these were being silently dropped during a
pg_upgrade as pg_dump would skip everything having to do with a large
object and pg_upgrade only copied the tables mentioned to the new
cluster.
As the file copies happen after the catalog dump and reload, we can't
simply include the COMMENTs and SECURITY LABELs in pg_dump's binary-mode
output but we also have to include the actual large object definition as
well. With the definition, comments, and security labels in the pg_dump
output and the file copies performed by pg_upgrade, all of the data and
metadata associated with large objects is able to be successfully pulled
forward across a pg_upgrade.
In 9.6 and master, we can simply adjust the dump bitmask to indicate
which components we don't want. In 9.5 and earlier, we have to put
explciit checks in in dumpBlob() and dumpBlobs() to not include the ACL
or the data when in binary-upgrade mode.
Adjustments made to the privileges regression test to allow another test
(large_object.sql) to be added which explicitly leaves a large object
with a comment in place to provide coverage of that case with
pg_upgrade.
Tom Lane [Mon, 6 Mar 2017 21:50:47 +0000 (16:50 -0500)]
Avoid dangling pointer to relation name in RLS code path in DoCopy().
With RLS active, "COPY tab TO ..." failed under -DRELCACHE_FORCE_RELEASE,
and would sometimes fail without that, because it used the relation name
directly from the relcache as part of the parsetree it's building. That
becomes a potentially-dangling pointer as soon as the relcache entry is
closed, a bit further down. Typical symptom if the relcache entry chanced
to get cleared would be "relation does not exist" error with a garbage
relation name, or possibly a core dump; but if you were really truly
unlucky, the COPY might copy from the wrong table.
Per report from Andrew Dunstan that regression tests fail with
-DRELCACHE_FORCE_RELEASE. The core tests now pass for me (but have
not tried "make check-world" yet).
Tom Lane [Sat, 4 Mar 2017 21:09:33 +0000 (16:09 -0500)]
In rebuild_relation(), don't access an already-closed relcache entry.
This reliably fails with -DRELCACHE_FORCE_RELEASE, as reported by
Andrew Dunstan, and could sometimes fail in normal operation, resulting
in a wrong persistence value being used for the transient table.
It's not immediately clear to me what effects that might have beyond
the risk of a crash while accessing OldHeap->rd_rel->relpersistence,
but it's probably not good.
Bug introduced by commit f41872d0c, and made substantially worse by
commit 85b506bbf, which added a second such access significantly
later than the heap_close. I doubt the first reference could fail
in a production scenario, but the second one definitely could.
Noah Misch [Thu, 2 Mar 2017 05:03:27 +0000 (00:03 -0500)]
Handle unaligned SerializeSnapshot() buffer.
Likewise in RestoreSnapshot(). Do so by copying between the user buffer
and a stack buffer of known alignment. Back-patch to 9.6, where this
last applies cleanly. In master, the select_parallel test dies with
SIGBUS on "Oracle Solaris 10 1/13 s10s_u11wos_24a SPARC", building
32-bit with gcc 4.9.2. In 9.6 and 9.5, the buffers in question happen
to be sufficiently-aligned, and this change is mere insurance against
future 9.6 changes or extension code compromising that.
Bruce Momjian [Sat, 25 Feb 2017 17:59:23 +0000 (12:59 -0500)]
pg_upgrade docs: clarify instructions on standby extensions
Previously the pg_upgrade standby upgrade instructions said not to
execute pgcrypto.sql, but it should have referenced the extension
command "CREATE EXTENSION pgcrypto". This patch makes that doc change.
Reported-by: a private bug report
Backpatch-through: 9.4, where standby instructions were added
Tom Lane [Fri, 24 Feb 2017 20:21:39 +0000 (15:21 -0500)]
Fix unportable definition of BSWAP64() macro.
We have a portable way of writing uint64 constants, but whoever wrote
this macro didn't know about it.
While at it, fix unsafe under-parenthesization of arguments. That might
be moot, because there are already good reasons not to use the macro on
anything more complicated than a simple variable, but it's still poor
practice.
Tom Lane [Wed, 22 Feb 2017 20:04:07 +0000 (15:04 -0500)]
Fix contrib/pg_trgm's extraction of trigrams from regular expressions.
The logic for removing excess trigrams from the result was faulty.
It intends to avoid merging the initial and final states of the NFA,
which is necessary, but in testing whether removal of a specific trigram
would cause that, it failed to consider the combined effects of all the
state merges that that trigram's removal would cause. This could result
in a broken final graph that would never match anything, leading to GIN
or GiST indexscans not finding anything.
To fix, add a "tentParent" field that is used only within this loop,
and set it to show state merges that we are tentatively going to do.
While examining a particular arc, we must chase up through tentParent
links as well as regular parent links (the former can only appear atop
the latter), and we must account for state init/fin flag merges that
haven't actually been done yet.
To simplify the latter, combine the separate init and fin bool fields
into a bitmap flags field. I also chose to get rid of the "children"
state list, which seems entirely inessential.
Per bug #14563 from Alexey Isayko, which the added test cases are based on.
Back-patch to 9.3 where this code was added.
Fujii Masao [Tue, 21 Feb 2017 18:11:58 +0000 (03:11 +0900)]
Make walsender always initialize the buffers.
Walsender uses the local buffers for each outgoing and incoming message.
Previously when creating replication slot, walsender forgot to initialize
one of them and which can cause the segmentation fault error. To fix this
issue, this commit changes walsender so that it always initialize them
before it executes the requested replication command.
Back-patch to 9.4 where replication slot was introduced.
Problem report and initial patch by Stas Kelvich, modified by me.
Report: https://www.postgresql.org/message-id/A1E9CB90-1FAC-4CAD-8DBA-9AA62A6E97C5@postgrespro.ru
Tom Lane [Tue, 21 Feb 2017 22:51:27 +0000 (17:51 -0500)]
Fix sloppy handling of corner-case errors in fd.c.
Several places in fd.c had badly-thought-through handling of error returns
from lseek() and close(). The fact that those would seldom fail on valid
FDs is probably the reason we've not noticed this up to now; but if they
did fail, we'd get quite confused.
LruDelete and LruInsert actually just Assert'd that lseek never fails,
which is pretty awful on its face.
In LruDelete, we indeed can't throw an error, because that's likely to get
called during error abort and so throwing an error would probably just lead
to an infinite loop. But by the same token, throwing an error from the
close() right after that was ill-advised, not to mention that it would've
left the LRU state corrupted since we'd already unlinked the VFD from the
list. I also noticed that really, most of the time, we should know the
current seek position and it shouldn't be necessary to do an lseek here at
all. As patched, if we don't have a seek position and an lseek attempt
doesn't give us one, we'll close the file but then subsequent re-open
attempts will fail (except in the somewhat-unlikely case that a
FileSeek(SEEK_SET) call comes between and allows us to re-establish a known
target seek position). This isn't great but it won't result in any state
corruption.
Meanwhile, having an Assert instead of an honest test in LruInsert is
really dangerous: if that lseek failed, a subsequent read or write would
read or write from the start of the file, not where the caller expected,
leading to data corruption.
In both LruDelete and FileClose, if close() fails, just LOG that and mark
the VFD closed anyway. Possibly leaking an FD is preferable to getting
into an infinite loop or corrupting the VFD list. Besides, as far as I can
tell from the POSIX spec, it's unspecified whether or not the file has been
closed, so treating it as still open could be the wrong thing anyhow.
I also fixed a number of other places that were being sloppy about
behaving correctly when the seekPos is unknown.
Also, I changed FileSeek to return -1 with EINVAL for the cases where it
detects a bad offset, rather than throwing a hard elog(ERROR). It seemed
pretty inconsistent that some bad-offset cases would get a failure return
while others got elog(ERROR). It was missing an offset validity check for
the SEEK_CUR case on a closed file, too.
Back-patch to all supported branches, since all this code is fundamentally
identical in all of them.
Tom Lane [Mon, 20 Feb 2017 15:05:00 +0000 (10:05 -0500)]
Fix documentation of to_char/to_timestamp TZ, tz, OF formatting patterns.
These are only supported in to_char, not in the other direction, but the
documentation failed to mention that. Also, describe TZ/tz as printing the
time zone "abbreviation", not "name", because what they print is elsewhere
referred to that way. Per bug #14558.
Tom Lane [Sun, 19 Feb 2017 21:14:52 +0000 (16:14 -0500)]
Adjust PL/Tcl regression test to dodge a possible bug or zone dependency.
One case in the PL/Tcl tests is observed to fail on RHEL5 with a Turkish
time zone setting. It's not clear if this is an old Tcl bug or something
odd about the zone data, but in any case that test is meant to see if the
Tcl [clock] command works at all, not what its corner-case behaviors are.
Therefore we have no need to test exactly which week a Sunday midnight is
considered to fall into. Probe the following Tuesday instead.
Magnus Hagander [Sat, 18 Feb 2017 12:45:14 +0000 (13:45 +0100)]
Fix help message for pg_basebackup -R
The recovery.conf file that's generated is specifically for replication,
and not needed (or wanted) for regular backup restore, so indicate that
in the message.
Tom Lane [Thu, 16 Feb 2017 16:30:07 +0000 (11:30 -0500)]
Doc: remove duplicate index entry.
This causes a warning with the old html-docs toolchain, though not with the
new. I had originally supposed that we needed both <indexterm> entries to
get both a primary index entry and a see-also link; but evidently not,
as pointed out by Fabien Coelho.
Tom Lane [Wed, 15 Feb 2017 23:15:47 +0000 (18:15 -0500)]
Formatting and docs corrections for logical decoding output plugins.
Make the typedefs for output plugins consistent with project style;
they were previously not even consistent with each other as to layout
or inclusion of parameter names. Make the documentation look the same,
and fix errors therein (missing and misdescribed parameters).
Tom Lane [Wed, 15 Feb 2017 21:40:05 +0000 (16:40 -0500)]
Make sure that hash join's bulk-tuple-transfer loops are interruptible.
The loops in ExecHashJoinNewBatch(), ExecHashIncreaseNumBatches(), and
ExecHashRemoveNextSkewBucket() are all capable of iterating over many
tuples without ever doing a CHECK_FOR_INTERRUPTS, so that the backend
might fail to respond to SIGINT or SIGTERM for an unreasonably long time.
Fix that. In the case of ExecHashJoinNewBatch(), it seems useful to put
the added CHECK_FOR_INTERRUPTS into ExecHashJoinGetSavedTuple() rather
than directly in the loop, because that will also ensure that both
principal code paths through ExecHashJoinOuterGetTuple() will do a
CHECK_FOR_INTERRUPTS, which seems like a good idea to avoid surprises.
Tom Lane [Wed, 15 Feb 2017 20:23:19 +0000 (15:23 -0500)]
Fix tab completion for "ALTER SYSTEM SET variable ...".
It wouldn't complete "TO" after the variable name, which is certainly
minor enough. But since we do complete "TO" after "SET variable ...",
and since this case used to work pre-9.6, I think this is a bug.
Also, fix the query used to collect the variable names; whoever last
touched it evidently didn't understand how the pieces are supposed
to fit together. It accidentally worked anyway, because readline
ignores irrelevant completions, but it was randomly unlike the ones
around it, and could be a source of actual bugs if someone copied
it as a prototype for another query.
Tom Lane [Wed, 15 Feb 2017 19:44:00 +0000 (14:44 -0500)]
Fix YA unwanted behavioral difference with operator_precedence_warning.
Jeff Janes noted that the error cursor position shown for some errors
would vary when operator_precedence_warning is turned on. We'd prefer
that option to have no undocumented effects, so this isn't desirable.
To fix, make sure that an AEXPR_PAREN node has the same exprLocation
as its child node.
(Note: it would be a little cheaper to use @2 here instead of an
exprLocation call, but there are cases where that wouldn't produce
the identical answer, so don't do it like that.)
Back-patch to 9.5 where this feature was introduced.
Noah Misch [Sun, 12 Feb 2017 21:03:41 +0000 (16:03 -0500)]
Ignore tablespace ACLs when ignoring schema ACLs.
The ALTER TABLE ALTER TYPE implementation can issue DROP INDEX and
CREATE INDEX to refit existing indexes for the new column type. Since
this CREATE INDEX is an implementation detail of an index alteration,
the ensuing DefineIndex() should skip ACL checks specific to index
creation. It already skips the namespace ACL check. Make it skip the
tablespace ACL check, too. Back-patch to 9.2 (all supported versions).
Tom Lane [Thu, 9 Feb 2017 20:49:57 +0000 (15:49 -0500)]
Blind try to fix portability issue in commit 8f93bd851 et al.
The S/390 members of the buildfarm are showing failures indicating
that they're having trouble with the rint() calls I added yesterday.
There's no good reason for that, and I wonder if it is a compiler bug
similar to the one we worked around in d9476b838. Try to fix it using
the same method as before, namely to store the result of rint() back
into a "double" variable rather than immediately converting to int64.
(This isn't entirely waving a dead chicken, since on machines with
wider-than-double float registers, the extra store forces a width
conversion. I don't know if S/390 is like that, but it seems worth
trying.)
In passing, merge duplicate ereport() calls in float8_timestamptz().
Tom Lane [Wed, 8 Feb 2017 23:04:59 +0000 (18:04 -0500)]
Fix roundoff problems in float8_timestamptz() and make_interval().
When converting a float value to integer microseconds, we should be careful
to round the value to the nearest integer, typically with rint(); simply
assigning to an int64 variable will truncate, causing apparently off-by-one
values in cases that should work. Most places in the datetime code got
this right, but not these two.
float8_timestamptz() is new as of commit e511d878f (9.6). Previous
versions effectively depended on interval_mul() to do roundoff correctly,
which it does, so this fixes an accuracy regression in 9.6.
The problem in make_interval() dates to its introduction in 9.4. Aside
from being careful to round not truncate, let's incorporate the hours and
minutes inputs into the result with exact integer arithmetic, rather than
risk introducing roundoff error where there need not have been any.
float8_timestamptz() problem reported by Erik Nordström, though this is
not his proposed patch. make_interval() problem found by me.
Tom Lane [Tue, 7 Feb 2017 15:24:25 +0000 (10:24 -0500)]
Correct thinko in last-minute release note item.
The CREATE INDEX CONCURRENTLY bug can only be triggered by row updates,
not inserts, since the problem would arise from an update incorrectly
being made HOT. Noted by Alvaro.
Tom Lane [Mon, 6 Feb 2017 18:19:51 +0000 (13:19 -0500)]
Avoid returning stale attribute bitmaps in RelationGetIndexAttrBitmap().
The problem with the original coding here is that we might receive (and
clear) a relcache invalidation signal for the target relation down inside
one of the index_open calls we're doing. Since the target is open, we
would not drop the relcache entry, just reset its rd_indexvalid and
rd_indexlist fields. But RelationGetIndexAttrBitmap() kept going, and
would eventually cache and return potentially-obsolete attribute bitmaps.
The case where this matters is where the inval signal was from a CREATE
INDEX CONCURRENTLY telling us about a new index on a formerly-unindexed
column. (In all other cases, the lock we hold on the target rel should
prevent any concurrent change in index state.) Even just returning the
stale attribute bitmap is not such a problem, because it shouldn't matter
during the transaction in which we receive the signal. What hurts is
caching the stale data, because it can survive into later transactions,
breaking CREATE INDEX CONCURRENTLY's expectation that later transactions
will not create new broken HOT chains. The upshot is that there's a window
for building corrupted indexes during CREATE INDEX CONCURRENTLY.
This patch fixes the problem by rechecking that the set of index OIDs
is still the same at the end of RelationGetIndexAttrBitmap() as it was
at the start. If not, we loop back and try again. That's a little
more than is strictly necessary to fix the bug --- in principle, we
could return the stale data but not cache it --- but it seems like a
bad idea on general principles for relcache to return data it knows
is stale.
There might be more hazards of the same ilk, or there might be a better
way to fix this one, but this patch definitely improves matters and seems
unlikely to make anything worse. So let's push it into today's releases
even as we continue to study the problem.
Tom Lane [Fri, 3 Feb 2017 00:11:27 +0000 (19:11 -0500)]
Fix placement of initPlans when forcibly materializing a subplan.
If we forcibly place a Material node atop a finished subplan, we need
to move any initPlans attached to the subplan up to the Material node,
in order to keep SS_finalize_plan() happy. I'd figured this out in
commit 7b67a0a49 for the case of materializing a cursor plan, but out of
an abundance of caution, I put the initPlan movement hack at the call
site for that case, rather than inside materialize_finished_plan().
That was the wrong thing, because it turns out to also be necessary for
the only other caller of materialize_finished_plan(), ie subselect.c.
We lacked any test cases that exposed the mistake, but bug#14524 from
Wei Congrui shows that it's possible to get an initPlan reference into
the top tlist in that case too, and then SS_finalize_plan() complains.
Hence, move the hack into materialize_finished_plan().
In HEAD, also relocate some recently-added tests in subselect.sql, which
I'd unthinkingly dropped into the middle of a sequence of related tests.
Andrew Dunstan [Wed, 1 Feb 2017 22:52:35 +0000 (17:52 -0500)]
Don't count background workers against a user's connection limit.
Doing so doesn't seem to be within the purpose of the per user
connection limits, and has particularly unfortunate effects in
conjunction with parallel queries.
Backpatch to 9.6 where parallel queries were introduced.
David Rowley, reviewed by Robert Haas and Albe Laurenz.
Stephen Frost [Tue, 31 Jan 2017 21:24:14 +0000 (16:24 -0500)]
pg_dump: Fix handling of ALTER DEFAULT PRIVILEGES
In commit 23f34fa, we changed how ACLs were handled to use the new
pg_init_privs catalog and to dump out the ACL commands as REVOKE+GRANT
combinations instead of trying to REVOKE all rights always and then
GRANT back just the ones which were in place.
Unfortunately, the DEFAULT PRIVILEGES system didn't quite get the
correct treatment with this change and ended up (incorrectly) only
including positive GRANTs instead of both the REVOKEs and GRANTs
necessary to preserve the correct privileges.
There are only a couple cases where such REVOKEs are possible because,
generally speaking, there's few rights which exist on objects by
default to be revoked.
Examples of REVOKEs which weren't being correctly preserved are when
privileges are REVOKE'd from the creator/owner, like so:
ALTER DEFAULT PRIVILEGES
FOR ROLE myrole
REVOKE SELECT ON TABLES FROM myrole;
or when other default privileges are being revoked, such as EXECUTE
rights granted to public for functions:
ALTER DEFAULT PRIVILEGES
FOR ROLE myrole
REVOKE EXECUTE ON FUNCTIONS FROM PUBLIC;
Fix this by correctly working out what the correct REVOKE statements are
(if any) and dump them out, just as we do for everything else.
Noticed while developing additional regression tests for pg_dump, which
will be landing shortly.
Stephen Frost [Tue, 31 Jan 2017 16:17:40 +0000 (11:17 -0500)]
test_pg_dump: perltidy cleanup
As pointed out by Alvaro, we actually use perltidy on the perl scripts
in the source tree, so go back to the results of a perltidy run for the
test_pg_dump TAP script.
To make it look slightly less tragic, I changed most of the independent
arguments into long-form single arguments (eg: -f file.sql changed to be
--file=file.sql) to avoid having them confusingly split across lines due
to perltidy.
Tom Lane [Mon, 30 Jan 2017 16:40:22 +0000 (11:40 -0500)]
Update time zone data files to tzdata release 2016j.
DST law changes in northern Cyprus (new zone Asia/Famagusta), Russia (new
zone Europe/Saratov), Tonga, Antarctica/Casey. Historical corrections for
Asia/Aqtau, Asia/Atyrau, Asia/Gaza, Asia/Hebron, Italy, Malta. Replace
invented zone abbreviation "TOT" for Tonga with numeric UTC offset; but
as in the past, we'll keep accepting "TOT" for input.
Stephen Frost [Mon, 30 Jan 2017 04:05:09 +0000 (23:05 -0500)]
Handle ALTER EXTENSION ADD/DROP with pg_init_privs
In commit 6c268df, pg_init_privs was added to track the initial
privileges of catalog objects and extensions. Unfortunately, that
commit didn't include understanding of ALTER EXTENSION ADD/DROP, which
allows the objects associated with an extension to be changed after the
initial CREATE EXTENSION script has been run.
The result of this meant that ACLs for objects added through
ALTER EXTENSION ADD were not recorded into pg_init_privs and we would
end up including those ACLs in pg_dump when we shouldn't have.
This commit corrects that by making sure to have pg_init_privs updated
when ALTER EXTENSION ADD/DROP is run, recording the permissions as they
are at ALTER EXTENSION ADD time, and removing any if/when ALTER
EXTENSION DROP is called.
This issue was pointed out by Moshe Jacobson as commentary on bug #14456
(which was actually a bug about versions prior to 9.6 not handling
custom ACLs on extensions correctly, an issue now addressed with
pg_init_privs in 9.6).
Back-patch to 9.6 where pg_init_privs was introduced.
Stephen Frost [Mon, 30 Jan 2017 04:05:09 +0000 (23:05 -0500)]
test_pg_dump TAP test whitespace cleanup
The formatting of the perl hashes used in the TAP tests for test_pg_dump
was rather horribly inconsistent and made it more difficult than it
really should have been to add new tests or adjust what tests are for
what runs, etc.
Andres Freund [Fri, 27 Jan 2017 00:47:03 +0000 (16:47 -0800)]
Add castNode(type, ptr) for safe casting between NodeTag based types.
The new function allows to cast from one NodeTag based type to
another, while asserting that the conversion is valid. This replaces
the common pattern of doing a cast and a Assert(IsA(ptr, type))
close-by.
As this seems likely to be used pervasively, we decided to backpatch
this change the addition of this macro. Otherwise backpatched fixes
are more likely not to work on back-branches.
On branches before 9.6, where we do not yet rely on inline functions
being available, the type assertion is only performed if PG_USE_INLINE
support is detected. The cast obviously is performed regardless.
For the benefit of verifying the macro compiles in the back-branches,
this commit contains a single use of the new macro. On master, a
somewhat larger conversion will be committed separately.
Author: Peter Eisentraut and Andres Freund Reviewed-By: Tom Lane
Discussion: https://postgr.es/m/c5d387d9-3440-f5e0-f9d4-71d53b9fbe52@2ndquadrant.com
Backpatch: 9.2-
Alvaro Herrera [Thu, 26 Jan 2017 20:45:22 +0000 (17:45 -0300)]
Remove test for COMMENT ON DATABASE
Our current DDL only allows a database name to be specified in COMMENT
ON DATABASE, which Andrew Dunstan reports to make this test fail on the
buildfarm. Remove the line until we gain a DDL command that allows the
current database to be operated on without having the specify it by
name.
Simon Riggs [Thu, 26 Jan 2017 20:06:44 +0000 (20:06 +0000)]
Reset hot standby xmin after restart
Hot_standby_feedback could be reset by reload and worked correctly, but if
the server was restarted rather than reloaded the xmin was not reset.
Force reset always if hot_standby_feedback is enabled at startup.
Tom Lane [Thu, 26 Jan 2017 17:17:47 +0000 (12:17 -0500)]
Ensure that a tsquery like '!foo' matches empty tsvectors.
!foo means "the tsvector does not contain foo", and therefore it should
match an empty tsvector. ts_match_vq() overenthusiastically supposed
that an empty tsvector could never match any query, so it forcibly
returned FALSE, the wrong answer. Remove the premature optimization.
Our behavior on this point was inconsistent, because while seqscans and
GIST index searches both failed to match empty tsvectors, GIN index
searches would find them, since GIN scans don't rely on ts_match_vq().
That makes this certainly a bug, not a debatable definition disagreement,
so back-patch to all supported branches.
Report and diagnosis by Tom Dunstan (bug #14515); added test cases by me.
Robert Haas [Fri, 20 Jan 2017 20:55:45 +0000 (15:55 -0500)]
Avoid useless respawining the autovacuum launcher at high speed.
When (1) autovacuum = off and (2) there's at least one database with
an XID age greater than autovacuum_freeze_max_age and (3) all tables
in that database that need vacuuming are already being processed by a
worker and (4) the autovacuum launcher is started, a kind of infinite
loop occurs. The launcher starts a worker and immediately exits. The
worker, finding no worker to do, immediately starts the launcher,
supposedly so that the next database can be processed. But because
datfrozenxid for that database hasn't been advanced yet, the new
worker gets put right back into the same database as the old one,
where it once again starts the launcher and exits. High-speed ping
pong ensues.
There are several possible ways to break the cycle; this seems like
the safest one.
Amit Khandekar (code) and Robert Haas (comments), reviewed by
Álvaro Herrera.
Stephen Frost [Thu, 19 Jan 2017 17:06:27 +0000 (12:06 -0500)]
Dump sequence data based on the TableDataInfo flag
When considering a sequence's Data entry in dumpSequenceData, we were
actually looking at the sequence definition's dump flag to decide if we
should dump the data or not. That's generally fine, except for when the
sequence data entry was created by processExtensionTables() because it's
a config sequence. In that case, the sequence itself won't be marked as
dumping data because it's part of an extension, leading to the need for
processExtensionTables() to create the sequence data entry.
This leads to extension config sequence data not being included in the
dump when it should be. Fix this by looking at the sequence data's dump
flag instead, just as dumpTableData() was doing for tables (which is why
config tables were correctly being handled), and add a regression test
to make sure we don't break it moving forward.
All of this is a bit round-about since we can now represent which
components of a given dump item should be dumped out through the dump
flag. A future improvement might be to change checkExtensionMembership()
to check for config sequences/tables and set the dump flag based on that
directly, possibly removing the need for processExtensionTables().
Tom Lane [Wed, 18 Jan 2017 21:33:18 +0000 (16:33 -0500)]
Reset the proper GUC in create_index test.
Thinko in commit a4523c5aa. It doesn't really affect anything at
present, but it would be a problem if any tests added later in this
file ought to get index-only-scan plans. Back-patch, like the previous
commit, just to avoid surprises in case we add such a test and then
back-patch it.
Alvaro Herrera [Wed, 18 Jan 2017 21:06:13 +0000 (18:06 -0300)]
Change some test macros to return true booleans
These macros work fine when they are used directly in an "if" test or
similar, but as soon as the return values are assigned to boolean
variables (or passed as boolean arguments to some function), they become
bugs, hopefully caught by compiler warnings. To avoid future problems,
fix the definitions so that they return actual booleans.
To further minimize the risk that somebody uses them in back-patched
fixes that only work correctly in branches starting from the current
master and not in old ones, back-patch the change to supported branches
as appropriate.
Tom Lane [Wed, 18 Jan 2017 20:21:52 +0000 (15:21 -0500)]
Disable transforms that replaced AT TIME ZONE with RelabelType.
These resulted in wrong answers if the relabeled argument could be matched
to an index column, as shown in bug #14504 from Evgeniy Kozlov. We might
be able to resurrect these optimizations by adjusting the planner's
treatment of RelabelType, or by adjusting btree's rules for selecting
comparison functions, but either solution will take careful analysis
and does not sound like a fit candidate for backpatching.
I left the catalog infrastructure in place and just reduced the transform
functions to always-return-NULL. This would be necessary anyway in the
back branches, and it doesn't seem important to be more invasive in HEAD.
Bug introduced by commit b8a18ad48. Back-patch to 9.5 where that came in.
Fujii Masao [Tue, 17 Jan 2017 08:29:15 +0000 (17:29 +0900)]
Fix an assertion failure related to an exclusive backup.
Previously multiple sessions could execute pg_start_backup() and
pg_stop_backup() to start and stop an exclusive backup at the same time.
This could trigger the assertion failure of
"FailedAssertion("!(XLogCtl->Insert.exclusiveBackup)".
This happend because, even while pg_start_backup() was starting
an exclusive backup, other session could run pg_stop_backup()
concurrently and mark the backup as not-in-progress unconditionally.
This patch introduces ExclusiveBackupState indicating the state of
an exclusive backup. This state is used to ensure that there is only
one session running pg_start_backup() or pg_stop_backup() at
the same time, to avoid the assertion failure.
Back-patch to all supported versions.
Author: Michael Paquier Reviewed-By: Kyotaro Horiguchi and me Reported-By: Andreas Seltenreich
Discussion: <87mvktojme.fsf@credativ.de>
Tom Lane [Sat, 14 Jan 2017 18:27:47 +0000 (13:27 -0500)]
Throw suitable error for COPY TO STDOUT/FROM STDIN in a SQL function.
A client copy can't work inside a function because the FE/BE wire protocol
doesn't support nesting of a COPY operation within query results. (Maybe
it could, but the protocol spec doesn't suggest that clients should support
this, and libpq for one certainly doesn't.)
In most PLs, this prohibition is enforced by spi.c, but SQL functions don't
use SPI. A comparison of _SPI_execute_plan() and init_execution_state()
shows that rejecting client COPY is the only discrepancy in what they
allow, so there's no other similar bugs.
This is an astonishingly ancient oversight, so back-patch to all supported
branches.
Peter Eisentraut [Fri, 13 Jan 2017 17:00:00 +0000 (12:00 -0500)]
pg_upgrade: Fix for changed pg_ctl default stop mode
In 9.5, the default pg_ctl stop mode was changed from "smart" to "fast".
pg_upgrade still thought the default mode was "smart" and only specified
the mode when "fast" was asked for. This results in using "fast" all
the time. It's not clear what the effect in practice is, but fix it
nonetheless to restore the previous behavior.
Robert Haas [Fri, 13 Jan 2017 18:29:31 +0000 (13:29 -0500)]
Fix cardinality estimates for parallel joins.
For a partial path, the cardinality estimate needs to reflect the
number of rows we think each worker will see, rather than the total
number of rows; otherwise, costing will go wrong. The previous coding
got this completely wrong for parallel joins.
Unfortunately, this change may destabilize plans for users of 9.6 who
have enabled parallel query, but since 9.6 is still fairly new I'm
hoping expectations won't be too settled yet. Also, this is really a
brown-paper-bag bug, so leaving it unfixed for the entire lifetime of
9.6 seems unwise.
Related reports (whose import I initially failed to recognize) by
Tomas Vondra and Tom Lane.
Stephen Frost [Wed, 11 Jan 2017 20:45:53 +0000 (15:45 -0500)]
pg_restore: Don't allow non-positive number of jobs
pg_restore will currently accept invalid values for the number of
parallel jobs to run (eg: -1), unlike pg_dump which does check that the
value provided is reasonable.
Worse, '-1' is actually a valid, independent, parameter (as an alias for
--single-transaction), leading to potentially completely unexpected
results from a command line such as:
-> pg_restore -j -1
Where a user would get neither parallel jobs nor a single-transaction.
Add in validity checking of the parallel jobs option, as we already have
in pg_dump, before we try to open up the archive. Also move the check
that we haven't been asked to run more parallel jobs than possible on
Windows to the same place, so we do all the option validity checking
before opening the archive.
Back-patch all the way, though for 9.2 we're adding the Windows-specific
check against MAXIMUM_WAIT_OBJECTS as that check wasn't back-patched
originally.
Stephen Frost [Tue, 10 Jan 2017 16:34:55 +0000 (11:34 -0500)]
pg_dump: Strict names with no matching schema
When using pg_dump --strict-names and a schema pattern which doesn't
match any schemas (eg: --schema='nonexistant*'), we were incorrectly
throwing an error claiming no tables were found when, really, there
were no schemas found:
-> pg_dump --strict-names --schema='nonexistant*'
pg_dump: no matching tables were found for pattern "nonexistant*"
Fix that by changing the error message to say 'schemas' instead, since
that is what we are actually complaining about.
Noticed while testing pg_dump error cases.
Back-patch to 9.6 where --strict-names and this error message were
introduced.
Alvaro Herrera [Mon, 9 Jan 2017 22:26:58 +0000 (19:26 -0300)]
Fix ALTER TABLE / SET TYPE for irregular inheritance
If inherited tables don't have exactly the same schema, the USING clause
in an ALTER TABLE / SET DATA TYPE misbehaves when applied to the
children tables since commit 9550e8348b79. Starting with that commit,
the attribute numbers in the USING expression are fixed during parse
analysis. This can lead to bogus errors being reported during
execution, such as:
ERROR: attribute 2 has wrong type
DETAIL: Table has type smallint, but query expects integer.
Since it wouldn't do to revert to the original coding, we now apply a
transformation to map the attribute numbers to the correct ones for each
child.
Reported by Justin Pryzby
Analysis by Tom Lane; patch by me.
Discussion: https://postgr.es/m/20170102225618.GA10071@telsasoft.com
Alvaro Herrera [Mon, 9 Jan 2017 21:19:29 +0000 (18:19 -0300)]
BRIN revmap pages are not standard pages ...
... and therefore we ought not to tell XLogRegisterBuffer the opposite,
when writing XLog for a brin update that moves the index tuple to a
different page. Otherwise, xlog insertion would try to "compress the
hole" when producing a full-page image for it; but since we don't update
pd_lower/upper, the hole covers the whole page. On WAL replay, the
revmap page becomes empty and so the entire portion of the index is
useless and needs to be recomputed.
This is low-probability: a BRIN update only moves an index tuple to a
different page when the summary tuple is larger than the existing one,
which doesn't happen with fixed-width datatypes. Also, the revmap
page must be first after a checkpoint.
Report and patch: Kuntal Ghosh
Bug is alleged to have detected by a WAL-consistency-checking tool.
Discussion: https://postgr.es/m/CAGz5QCJ=00UQjScSEFbV=0qO5ShTZB9WWz_Fm7+Wd83zPs9Geg@mail.gmail.com
I posted a test case demonstrating the problem, but I'm refraining from
adding it to the test suite; if the WAL consistency tool makes it in,
that will be a better way to catch this from regressing. (We should
definitely have someting that causes not-same-page updates, though.)
Stephen Frost [Fri, 6 Jan 2017 20:27:50 +0000 (15:27 -0500)]
Protect against NULL-dereference in pg_dump
findTableByOid() is allowed to return NULL and we should therefore be
checking for that case. getOwnedSeqs() and dumpSequence() shouldn't
ever actually see this happen, but given odd circumstances it might and
commit f9e439b1 probably shouldn't have removed that check.
Pointed out by Coverity. Initial patch from Michael Paquier.
Back-patch to 9.6, where that commit had removed the check.