Michael Paquier [Wed, 24 Jul 2019 01:54:32 +0000 (10:54 +0900)]
Improve stability of TAP test for synchronous replication
Slow buildfarm machines have run into issues with this TAP test caused
by a race condition related to the startup of a set of standbys, where
it is possible to finish with an unexpected order in the WAL sender
array of the primary.
This closes the race condition by making sure that any standby started
is registered into the WAL sender array of the primary before starting
the next one based on lookups of pg_stat_replication.
Backpatch down to 9.6 where the test has been introduced.
Tom Lane [Mon, 22 Jul 2019 18:55:23 +0000 (14:55 -0400)]
Install dependencies to prevent dropping partition key columns.
The logic in ATExecDropColumn that rejects dropping partition key
columns is quite an inadequate defense, because it doesn't execute
in cases where a column needs to be dropped due to cascade from
something that only the column, not the whole partitioned table,
depends on. That leaves us with a badly broken partitioned table;
even an attempt to load its relcache entry will fail.
We really need to have explicit pg_depend entries that show that the
column can't be dropped without dropping the whole table. Hence,
add those entries. In v12 and HEAD, bump catversion to ensure that
partitioned tables will have such entries. We can't do that in
released branches of course, so in v10 and v11 this patch affords
protection only to partitioned tables created after the patch is
installed. Given the lack of field complaints (this bug was found
by fuzz-testing not by end users), that's probably good enough.
In passing, fix ATExecDropColumn and ATPrepAlterColumnType
messages to be more specific about which partition key column
they're complaining about.
Per report from Manuel Rigger. Back-patch to v10 where partitioned
tables were added.
Jeff Davis [Thu, 18 Jul 2019 23:49:10 +0000 (16:49 -0700)]
Fix daterange canonicalization for +/- infinity.
The values 'infinity' and '-infinity' are a part of the DATE type
itself, so a bound of the date 'infinity' is not the same as an
unbounded/infinite range. However, it is still wrong to try to
canonicalize such values, because adding or subtracting one has no
effect. Fix by treating 'infinity' and '-infinity' the same as
unbounded ranges for the purposes of canonicalization (but not other
purposes).
Backpatch to all versions because it is inconsistent with the
documented behavior. Note that this could be an incompatibility for
applications relying on the behavior contrary to the documentation.
Author: Laurenz Albe
Reviewed-by: Thomas Munro
Discussion: https://postgr.es/m/77f24ea19ab802bc9bc60ddbb8977ee2d646aec1.camel%40cybertec.at
Backpatch-through: 9.4
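To illustrate the rule, here is a minimal Python sketch of the canonicalization step. It assumes a toy model in which a range is a (lower, lower_inc, upper, upper_inc) tuple, None means unbounded, and the strings 'infinity' and '-infinity' stand in for the special DATE values. The name canonicalize and the tuple layout are illustrative only, not PostgreSQL's API.

```python
from datetime import date, timedelta

INFINITE = ("infinity", "-infinity")  # stand-ins for the special DATE values


def canonicalize(lower, lower_inc, upper, upper_inc):
    """Convert a discrete date range to the canonical '[)' form.

    Bounds that are None (unbounded) or +/- infinity are left untouched,
    because adding one day to them has no effect.
    """
    one_day = timedelta(days=1)
    if lower is not None and lower not in INFINITE and not lower_inc:
        lower, lower_inc = lower + one_day, True      # (a  ->  [a+1
    if upper is not None and upper not in INFINITE and upper_inc:
        upper, upper_inc = upper + one_day, False     # b]  ->  b+1)
    return lower, lower_inc, upper, upper_inc
```

In this model a finite inclusive upper bound is bumped by one day and made exclusive, while an 'infinity' bound keeps its inclusivity flag unchanged.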
Tom Lane [Wed, 17 Jul 2019 22:26:24 +0000 (18:26 -0400)]
Sync our copy of the timezone library with IANA release tzcode2019b.
A large fraction of this diff is just due to upstream's somewhat
random decision to rename a bunch of internal variables and struct
fields. However, there is an interesting new feature in zic:
it's grown a "-b slim" option that emits zone files without 32-bit
data and other backwards-compatibility hacks. We should consider
whether we wish to enable that.
Tom Lane [Tue, 16 Jul 2019 22:17:47 +0000 (18:17 -0400)]
Fix thinko in construction of old_conpfeqop list.
This should lappend the OIDs, not lcons them; the existing code produced
a list in reversed order. This is harmless for single-key FKs or FKs
where all the key columns are of the same type, which probably explains
how it went unnoticed. But if those conditions are not met,
ATAddForeignKeyConstraint would make the wrong decision about whether an
existing FK needs to be revalidated. I think it would almost always err
in the safe direction by revalidating a constraint that didn't need it.
You could imagine scenarios where the pfeqop check was fooled by
swapping the types of two FK columns in one ALTER TABLE, but that case
would probably be rejected by other tests, so it might be impossible to
get to the worst-case scenario where an FK should be revalidated and
isn't. (And even then, it's likely to be fine, unless there are weird
inconsistencies in the equality behavior of the replacement types.)
However, this is a performance bug at least.
Noted while poking around to see whether lcons calls could be converted
to lappend.
This bug is old, dating to commit cb3a7c2b9, so back-patch to all
supported branches.
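The ordering problem can be shown with a small Python sketch. The lcons and lappend functions below only mimic the prepend/append behavior of the PostgreSQL list functions of the same name, and the operator OIDs and the positional comparison are hypothetical stand-ins for what ATAddForeignKeyConstraint checks.

```python
def lcons(item, lst):
    """Prepend, like lcons(): the new item becomes the head of the list."""
    return [item] + lst


def lappend(lst, item):
    """Append, like lappend(): the new item becomes the tail of the list."""
    return lst + [item]


def build(oids, use_lcons):
    """Accumulate per-column operator OIDs one at a time."""
    out = []
    for oid in oids:
        out = lcons(oid, out) if use_lcons else lappend(out, oid)
    return out


# Hypothetical equality-operator OIDs for a two-column FK whose key
# columns have different types.
old_ops = [96, 410]

reversed_list = build(old_ops, use_lcons=True)    # [410, 96]
ordered_list = build(old_ops, use_lcons=False)    # [96, 410]

# A positional comparison against the new operators only matches when
# the old list was built in column order.
new_ops = [96, 410]
needs_revalidation_buggy = reversed_list != new_ops   # True (spurious)
needs_revalidation_fixed = ordered_list != new_ops    # False
```

With a single key column, or with identical operators in every position, the reversed list compares equal to the ordered one. That matches the commit's explanation of why the bug went unnoticed.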
Michael Paquier [Wed, 10 Jul 2019 06:15:11 +0000 (15:15 +0900)]
Fix variable initialization when using buffering build with GiST
This can cause valgrind to complain, as the flag marking a buffer as a
temporary copy was not getting initialized.
While at it, fill newly-created buffer pages with zeros. This does
not matter when loading a block from a temporary file, but it makes the
push of an index tuple into a new buffer page safer.
This has been introduced by 1d27dcf, so backpatch all the way down to
9.4.
Author: Alexander Lakhin
Discussion: https://postgr.es/m/15899-0d24fb273b3dd90c@postgresql.org
Backpatch-through: 9.4
Thomas Munro [Tue, 9 Jul 2019 22:16:02 +0000 (10:16 +1200)]
Pass QueryEnvironment down to EvalPlanQual's EState.
Otherwise the executor can't see trigger transition tables during
EPQ evaluation. Fixes bug #15900 and almost certainly also #15720.
Back-patch to 10, where trigger transition tables landed.
Author: Alex Aktsipetrov
Reviewed-by: Thomas Munro, Tom Lane
Discussion: https://postgr.es/m/15900-bc482754fe8d7415%40postgresql.org
Discussion: https://postgr.es/m/15720-38c2b29e5d720187%40postgresql.org
David Rowley [Wed, 3 Jul 2019 11:46:06 +0000 (23:46 +1200)]
Don't remove surplus columns from GROUP BY for inheritance parents
d4c3a156c added code to remove columns that were not part of a table's
PRIMARY KEY constraint from the GROUP BY clause when all the primary key
columns were present in the group by. This is fine to do since we know
that there will only be one row per group coming from this relation.
However, the logic failed to consider inheritance parent relations. These
can have child relations without a primary key, but even if they did, they
could duplicate one of the parent's rows or one from another child
relation. In this case, those additional GROUP BY columns are required.
Fix this by disabling the optimization for inheritance parent tables.
In v11 and beyond, partitioned tables are fine since partitions cannot
overlap and before v11 partitioned tables could not have a primary key.
Reported-by: Manuel Rigger
Discussion: http://postgr.es/m/CA+u7OA7VLKf_vEr6kLF3MnWSA9LToJYncgpNX2tQ-oWzYCBQAw@mail.gmail.com
Backpatch-through: 9.6
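A toy Python model of why the extra GROUP BY columns matter for an inheritance parent. The table contents are invented for illustration, and a scan of the parent is modeled simply as the concatenation of parent and child rows.

```python
from collections import Counter

# Rows as (id, name); id is the parent's PRIMARY KEY.
parent_rows = [(1, "a"), (2, "b")]
child_rows = [(1, "z")]   # a child may repeat one of the parent's key values

# Scanning an inheritance parent also returns rows from its children.
scanned = parent_rows + child_rows

groups_full = Counter((id_, name) for id_, name in scanned)   # GROUP BY id, name
groups_pk_only = Counter(id_ for id_, _ in scanned)           # GROUP BY id

print(len(groups_full))     # 3 groups
print(len(groups_pk_only))  # 2 groups: (1, 'a') and (1, 'z') were merged
```

Dropping name from the grouping key changes the result as soon as a child repeats a key value, so the optimization is only safe when the primary key really guarantees one row per group.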
Michael Paquier [Tue, 2 Jul 2019 23:58:04 +0000 (08:58 +0900)]
Add support for Visual Studio 2019 in build scripts
This adjusts the documentation and the scripts related to the versions
of Windows SDK supported.
Author: Haribabu Kommi
Reviewed-by: Andrew Dunstan, Juan José Santamaría Flecha, Michael Paquier
Discussion: https://postgr.es/m/CAJrrPGcfqXhfPyMrny9apoDU7M1t59dzVAvoJ9AeAh5BJi+UzA@mail.gmail.com
Backpatch-through: 9.4
Tom Lane [Tue, 2 Jul 2019 17:35:14 +0000 (13:35 -0400)]
Fix tab completion of "SET variable TO|=" to not offer bogus completions.
Don't think that the context "UPDATE tab SET var =" is a GUC-setting
command.
If we have "SET var =" but the "var" is not a known GUC variable,
don't offer any completions. The most likely explanation is that
we've misparsed the context and it's not really a GUC-setting command.
Per gripe from Ken Tanzer. Back-patch to 9.6. The issue exists
further back, but before 9.6 the code looks very different and it
doesn't actually know whether the "var" name matches anything,
so I desisted from trying to fix it.
Don't read fields of a misaligned ExpandedObjectHeader or AnyArrayType.
UBSan complains about this. Instead, cast to a suitable type requiring
only 4-byte alignment. DatumGetAnyArrayP() already assumes one can cast
between AnyArrayType and ArrayType, so this doesn't introduce a new
assumption. Back-patch to 9.5, where AnyArrayType was introduced.
Andrew Gierth [Sun, 30 Jun 2019 22:49:25 +0000 (23:49 +0100)]
Repair logic for reordering grouping sets optimization.
The logic in reorder_grouping_sets to order grouping set elements to
match a pre-specified sort ordering was defective, resulting in
unnecessary sort nodes (though the query output would still be
correct). Repair, simplifying the code a little, and add a test.
Per report from Richard Guo, though I didn't use their patch. Original
bug seems to have been my fault.
Backpatch back to 9.5 where grouping sets were introduced.
Thomas Munro [Thu, 27 Jun 2019 23:11:26 +0000 (11:11 +1200)]
Fix misleading comment in nodeIndexonlyscan.c.
The stated reason for acquiring predicate locks on heap pages hasn't
existed since commit c01262a8, so fix the comment. Perhaps in a later
release we'll also be able to change the code to use tuple locks.
Tomas Vondra [Thu, 27 Jun 2019 16:14:25 +0000 (18:14 +0200)]
Update reference to sampling algorithm in analyze.c
Commit 83e176ec1 moved row sampling functions from analyze.c to
utils/misc/sampling.c, but failed to update comment referring to
the sampling algorithm from Jeff Vitter's paper. Correct the
comment by pointing to utils/misc/sampling.c.
Michael Paquier [Wed, 26 Jun 2019 14:05:06 +0000 (23:05 +0900)]
Add support for OpenSSL 1.1.0 and newer versions in MSVC scripts
Up to now, the MSVC build scripts were able to support only one fixed
version of OpenSSL, and they lacked logic to detect the version of
OpenSSL a given compilation of Postgres is linking to (currently 1.0.2,
the latest LTS of upstream which will be EOL'd at the end of 2019).
This commit adds more logic to detect the version of OpenSSL used by a
build and makes use of it to add support for compilation with OpenSSL
1.1.0 which requires a new set of compilation flags to work properly.
The supported OpenSSL installers have changed their library layer with
various library renames with the upgrade to 1.1.0, making the logic a
bit more complicated. The scripts are now able to adapt to the new
world order.
Reported-by: Sergey Pashkov
Author: Juan José Santamaría Flecha, Michael Paquier
Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/15789-8fc75dea3c5a17c8@postgresql.org
Backpatch-through: 9.4
Michael Paquier [Tue, 25 Jun 2019 02:16:04 +0000 (11:16 +0900)]
Fix thinkos in LookupFuncName() for function name lookups
This could trigger valgrind failures when doing ambiguous function name
lookups when no arguments are provided by the caller. The problem has
been introduced in aefeb68, so backpatch to v10. HEAD is fine thanks to
the refactoring done in bfb456c1.
Reported-by: Alexander Lakhin
Author: Alexander Lakhin, Michael Paquier
Discussion: https://postgr.es/m/3d068be5-f617-a5ee-99f6-458a407bfd65@gmail.com
Backpatch-through: 10
Tom Lane [Mon, 24 Jun 2019 20:43:05 +0000 (16:43 -0400)]
Further fix ALTER COLUMN TYPE's handling of indexes and index constraints.
This patch reverts all the code changes of commit e76de8861, which turns
out to have been seriously misguided. We can't wait till later to compute
the definition string for an index; we must capture that before applying
the data type change for any column it depends on, else ruleutils.c will
deliver wrong/misleading results. (This fine point was documented
nowhere, of course.)
I'd also managed to forget that ATExecAlterColumnType executes once per
ALTER COLUMN TYPE clause, not once per statement; which resulted in the
code being basically completely broken for any case in which multiple ALTER
COLUMN TYPE clauses are applied to a table having non-constraint indexes
that must be rebuilt. Through very bad luck, none of the existing test
cases nor the ones added by e76de8861 caught that, but of course it was
soon found in the field.
The previous patch also had an implicit assumption that if a constraint's
index had a dependency on a table column, so would the constraint --- but
that isn't actually true, so it didn't fix such cases.
Instead of trying to delete unneeded index dependencies later, do the
is-there-a-constraint lookup immediately on seeing an index dependency,
and switch to remembering the constraint if so. In the unusual case of
multiple column dependencies for a constraint index, this will result in
duplicate constraint lookups, but that's not that horrible compared to all
the other work that happens here. Besides, such cases did not work at all
before, so it's hard to argue that they're performance-critical for anyone.
Per bug #15865 from Keith Fiske. As before, back-patch to all supported
branches.
Tom Lane [Sun, 23 Jun 2019 00:31:50 +0000 (20:31 -0400)]
Fix spinlock assembly code for MIPS so it works on MIPS r6.
Original MIPS-I processors didn't have the LL/SC instructions (nor any
other userland synchronization primitive). If the build toolchain
targets that ISA variant by default, as an astonishingly large fraction
of MIPS platforms still do, the assembler won't take LL/SC without
coercion in the form of a ".set mips2" instruction. But we issued that
unconditionally, making it an ISA downgrade for chips later than MIPS2.
That breaks things for the latest MIPS r6 ISA, which encodes these
instructions differently. Adjust the code so we don't change ISA level
if it's >= 2.
Note that this patch doesn't change what happens on an actual MIPS-I
processor: either the kernel will emulate these instructions
transparently, or you'll get a SIGILL failure. That tradeoff seemed
fine in 2002 when this code was added (cf 3cbe6b247), and it's even
more so today when MIPS-I is basically extinct. But let's add a
comment about that.
YunQiang Su (with cosmetic adjustments by me). Back-patch to all
supported branches.
Noah Misch [Sat, 22 Jun 2019 03:34:23 +0000 (20:34 -0700)]
Consolidate methods for translating a Perl path to a Windows path.
This fixes some TAP suites when using msys Perl and a builddir located
in an msys mount point other than "/". For example, builddir=/c/pg
exhibited the problem, since /c/pg falls in mount point "/c".
Back-patch to 9.6, where tests first started to perform such
translations. In back branches, offer both new and old APIs.
Thomas Munro [Thu, 20 Jun 2019 22:57:07 +0000 (10:57 +1200)]
Remove obsolete comments about semaphores from proc.c.
Commit 6753333f switched from a semaphore-based wait to a latch-based
wait for ProcSleep()/ProcWakeup(), but left behind some stray references
to semaphores.
Back-patch to 9.5.
Reviewed-by: Daniel Gustafsson, Michael Paquier
Discussion: https://postgr.es/m/CA+hUKGLs5H6zhmgTijZ1OaJvC1sG0=AFXc1aHuce32tKiQrdEA@mail.gmail.com
Alvaro Herrera [Tue, 18 Jun 2019 22:23:16 +0000 (18:23 -0400)]
Avoid spurious deadlocks when upgrading a tuple lock
This puts back reverted commit de87a084c0a5, with some bug fixes.
When two (or more) transactions are waiting for transaction T1 to release a
tuple-level lock, and transaction T1 upgrades its lock to a higher level, a
spurious deadlock can be reported among the waiting transactions when T1
finishes. The simplest example case seems to be:
T1: select id from job where name = 'a' for key share;
Y: select id from job where name = 'a' for update; -- starts waiting for T1
Z: select id from job where name = 'a' for key share;
T1: update job set name = 'b' where id = 1;
Z: update job set name = 'c' where id = 1; -- starts waiting for T1
T1: rollback;
At this point, transaction Y is rolled back on account of a deadlock: Y
holds the heavyweight tuple lock and is waiting for the Xmax to be released,
while Z holds part of the multixact and tries to acquire the heavyweight
lock (per protocol) and goes to sleep; once T1 releases its part of the
multixact, Z is awakened only to be put back to sleep on the heavyweight
lock that Y is holding while sleeping. Kaboom.
This can be avoided by having Z skip the heavyweight lock acquisition. As
far as I can see, the biggest downside is that if there are multiple Z
transactions, the order in which they resume after T1 finishes is not
guaranteed.
Backpatch to 9.6. The patch applies cleanly on 9.5, but the new tests don't
work there (because isolationtester is not smart enough), so I'm not going
to risk it.
Michael Paquier [Mon, 17 Jun 2019 13:14:09 +0000 (22:14 +0900)]
Fix buffer overflow when processing SCRAM final message in libpq
When a client connects to a rogue server sending specifically-crafted
messages, this can suffice to execute arbitrary code as the operating
system account used by the client.
While at it, fix the error handling when decoding an incorrect salt
included in the first message received from server.
Author: Michael Paquier
Reviewed-by: Jonathan Katz, Heikki Linnakangas
Security: CVE-2019-10164
Backpatch-through: 10
Michael Paquier [Mon, 17 Jun 2019 12:48:34 +0000 (21:48 +0900)]
Fix buffer overflow when parsing SCRAM verifiers in backend
Any authenticated user can overflow a stack-based buffer by changing the
user's own password to a purpose-crafted value. This often suffices to
execute arbitrary code as the PostgreSQL operating system account.
This fix is contributed by multiple folks, based on an initial analysis
from Tom Lane. This issue has been introduced by 68e61ee, so it was
possible to make use of it at authentication time. It became easier
to trigger after ccae190, which made the SCRAM parsing more
strict when changing a password, in the case where the client passes
down a verifier already hashed using SCRAM. Back-patch to v10 where
SCRAM has been introduced.
Reported-by: Alexander Lakhin
Author: Jonathan Katz, Heikki Linnakangas, Michael Paquier
Security: CVE-2019-10164
Backpatch-through: 10
This code has some tricky corner cases that I'm not sure are correct and
not properly tested anyway, so I'm reverting the whole thing for next
week's releases (reintroducing the deadlock bug that we set out to fix).
I'll try again afterwards.
Andrew Gierth [Sat, 15 Jun 2019 17:15:23 +0000 (18:15 +0100)]
Prefer timezone name "UTC" over alternative spellings.
tzdb 2019a made "UCT" a link to the "UTC" zone rather than a separate
zone with its own abbreviation. Unfortunately, our code for choosing a
timezone in initdb has an arbitrary preference for names earlier in
the alphabet, and so it would choose the spelling "UCT" over "UTC"
when the system is running on a UTC zone.
Commit 23bd3cec6 was backpatched in order to address this issue, but
that code helps only when /etc/localtime exists as a symlink, and does
nothing to help on systems where /etc/localtime is a copy of a zone
file (as is the standard setup on FreeBSD and probably some other
platforms too) or when /etc/localtime is simply absent (giving UTC as
the default).
Accordingly, add a preference for the spelling "UTC", such that if
multiple zone names have equally good content matches, we prefer that
name before applying the existing arbitrary rules. Also add a slightly
lower preference for "Etc/UTC"; lower because that preserves the
previous behaviour of choosing the shorter name, but letting us still
choose "Etc/UTC" over "Etc/UCT" when both exist but "UTC" does
not (not common, but I've seen it happen).
Backpatch all the way, because the tzdb change that sparked this issue
is in those branches too.
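The tie-breaking order described above can be sketched in Python as a sort key. This assumes all candidates already have equally good content matches. The function names are hypothetical, and the ordering of the older rules (shorter name first, then alphabetical) is inferred from this commit message.

```python
def zone_sort_key(name):
    """Rank equally good zone-name candidates; the lowest key wins.

    Assumed order: "UTC" first, then "Etc/UTC", then the pre-existing
    tie-breakers (shorter name, then alphabetical order).
    """
    preference = {"UTC": 0, "Etc/UTC": 1}.get(name, 2)
    return (preference, len(name), name)


def pick_zone(candidates):
    """Return the preferred spelling among equivalent zone names."""
    return min(candidates, key=zone_sort_key)
```

For example, pick_zone(["UCT", "UTC", "Etc/UCT", "Etc/UTC"]) returns "UTC", and pick_zone(["Etc/UCT", "Etc/UTC"]) returns "Etc/UTC". Candidates with no special preference still fall back to the length and alphabetical rules.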
Tom Lane [Fri, 14 Jun 2019 15:25:13 +0000 (11:25 -0400)]
Attempt to identify system timezone by reading /etc/localtime symlink.
On many modern platforms, /etc/localtime is a symlink to a file within the
IANA database. Reading the symlink lets us find out the name of the system
timezone directly, without going through the brute-force search embodied in
scan_available_timezones(). This shortens the runtime of initdb by some
tens of ms, which is helpful for the buildfarm, and it also allows us to
reliably select the same zone name the system was actually configured for,
rather than possibly choosing one of IANA's many zone aliases. (For
example, in a system configured for "Asia/Tokyo", the brute-force search
would not choose that name but its alias "Japan", on the grounds of the
latter string being shorter. More surprisingly, "Navajo" is preferred
to either "America/Denver" or "US/Mountain", as seen in an old complaint
from Josh Berkus.)
If /etc/localtime doesn't exist, or isn't a symlink, or we can't make
sense of its contents, or the contents match a zone we know but that
zone doesn't match the observed behavior of localtime(), fall back to
the brute-force search.
Also, tweak initdb so that it prints the zone name it selected.
In passing, replace the last few references to the "Olson" database in
code comments with "IANA", as that's been our preferred term since
commit b2cbced9e.
Back-patch of commit 23bd3cec6. The original intention was to not
back-patch, since this can result in cosmetic behavioral changes ---
for example, on my own workstation initdb now chooses "America/New_York",
where it used to prefer "US/Eastern" which is equivalent and shorter.
However, our hand has been more or less forced by tzdb update 2019a,
which made the "UCT" zone fully equivalent to "UTC". Our old code
now prefers "UCT" on the grounds of it being alphabetically first,
and that's making nobody happy. Choosing the alias indicated by
/etc/localtime is a more defensible behavior. (Users who don't like
the results can always force the decision by setting the TZ environment
variable before running initdb.)
Patch by me, per a suggestion from Robert Haas; review by Michael Paquier
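A rough Python sketch of the symlink approach follows. It is not the initdb code: the function names are hypothetical, the "zoneinfo/" path component is an assumption about where the IANA tree is installed, and the check that the named zone matches the observed behavior of localtime() is left out.

```python
import os


def zone_from_target(target):
    """Extract a zone name such as 'Asia/Tokyo' from a symlink target."""
    marker = "zoneinfo/"     # assumed directory component of the IANA tree
    _, found, name = target.partition(marker)
    return name if found and name else None


def zone_from_localtime(path="/etc/localtime"):
    """Return the zone name the symlink points at, or None to signal
    that the caller should fall back to the brute-force search."""
    try:
        target = os.readlink(path)
    except OSError:          # file is missing, or is not a symlink
        return None
    return zone_from_target(target)
```

Returning None for every unusable case keeps the fallback path explicit: the caller runs the existing search whenever the shortcut gives no answer.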
Alvaro Herrera [Thu, 13 Jun 2019 21:28:24 +0000 (17:28 -0400)]
Avoid spurious deadlocks when upgrading a tuple lock
When two (or more) transactions are waiting for transaction T1 to release a
tuple-level lock, and transaction T1 upgrades its lock to a higher level, a
spurious deadlock can be reported among the waiting transactions when T1
finishes. The simplest example case seems to be:
T1: select id from job where name = 'a' for key share;
Y: select id from job where name = 'a' for update; -- starts waiting for T1
Z: select id from job where name = 'a' for key share;
T1: update job set name = 'b' where id = 1;
Z: update job set name = 'c' where id = 1; -- starts waiting for T1
T1: rollback;
At this point, transaction Y is rolled back on account of a deadlock: Y
holds the heavyweight tuple lock and is waiting for the Xmax to be released,
while Z holds part of the multixact and tries to acquire the heavyweight
lock (per protocol) and goes to sleep; once T1 releases its part of the
multixact, Z is awakened only to be put back to sleep on the heavyweight
lock that Y is holding while sleeping. Kaboom.
This can be avoided by having Z skip the heavyweight lock acquisition. As
far as I can see, the biggest downside is that if there are multiple Z
transactions, the order in which they resume after T1 finishes is not
guaranteed.
Backpatch to 9.6. The patch applies cleanly on 9.5, but the new tests don't
work there (because isolationtester is not smart enough), so I'm not going
to risk it.
Tom Lane [Thu, 13 Jun 2019 14:53:17 +0000 (10:53 -0400)]
Mark ReplicationSlotCtl as PGDLLIMPORT.
Also MyReplicationSlot, in branches where it wasn't already.
This was discussed in the thread that resulted in c572599c6, but
for some reason nobody pulled the trigger. Now that we have another
request for the same thing, we should just do it.
Etsuro Fujita [Thu, 13 Jun 2019 08:59:12 +0000 (17:59 +0900)]
postgres_fdw: Account for triggers in non-direct remote UPDATE planning.
Previously, in postgresPlanForeignModify, we planned an UPDATE operation
on a foreign table so that we transmit only columns that were explicitly
targets of the UPDATE, so as to avoid unnecessary data transmission, but
if there were BEFORE ROW UPDATE triggers on the foreign table, those
triggers might change values for non-target columns, in which case we
would miss sending changed values for those columns. Prevent optimizing
away transmitting all columns if there are BEFORE ROW UPDATE triggers on
the foreign table.
This is an oversight in commit 7cbe57c34 which added triggers on foreign
tables, so apply the patch all the way back to 9.4 where that came in.
Author: Shohei Mochizuki
Reviewed-by: Amit Langote
Discussion: https://postgr.es/m/201905270152.x4R1q3qi014550@toshiba.co.jp
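The planning decision can be summarized with a short Python sketch. The function and parameter names are hypothetical and the real logic lives in postgresPlanForeignModify. The point shown is only that the presence of a BEFORE ROW UPDATE trigger disables the column-pruning optimization.

```python
def columns_to_transmit(all_columns, update_targets, has_before_row_update_trigger):
    """Pick the columns to send in a non-direct remote UPDATE."""
    if has_before_row_update_trigger:
        # A trigger may modify any column, so none can be skipped.
        return list(all_columns)
    # Otherwise only the explicitly targeted columns need to travel.
    return [c for c in all_columns if c in update_targets]
```

For a table with columns a, b, c and an UPDATE that sets only b, this returns ["b"] without a trigger and all three columns with one.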
Tom Lane [Thu, 13 Jun 2019 02:54:46 +0000 (22:54 -0400)]
Doc: improve description of allowed spellings for Boolean input.
datatype.sgml failed to explain that boolin() accepts any unique
prefix of the basic input strings. Indeed it was actively misleading
because it called out a few minimal prefixes without mentioning that
there were more valid inputs.
I also felt that it wasn't doing anybody any favors by conflating
SQL key words, valid Boolean input, and string literals containing
valid Boolean input. Rewrite in hopes of reducing the confusion.
Per bug #15836 from Yuming Wang, as diagnosed by David Johnston.
Back-patch to supported branches.
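As an illustration of the "any unique prefix" rule, here is a small Python model. It assumes the basic input strings are true, false, yes, no, on, off, 1 and 0, compared case-insensitively after trimming whitespace. The function is a sketch of the rule, not a port of boolin().

```python
WORDS = {"true": True, "false": False, "yes": True, "no": False,
         "on": True, "off": False, "1": True, "0": False}


def parse_bool(text):
    """Return True/False for any unique prefix of a known spelling,
    or None if the input is empty, unknown, or ambiguous."""
    s = text.strip().lower()
    if not s:
        return None
    # Collect the distinct values of every spelling that starts with s.
    matches = {value for word, value in WORDS.items() if word.startswith(s)}
    return matches.pop() if len(matches) == 1 else None
```

Under this model "t", "tr" and "TRUE" all parse as true and "of" parses as false. A bare "o" is rejected because it is a prefix of both "on" and "off".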
Tom Lane [Wed, 12 Jun 2019 23:42:39 +0000 (19:42 -0400)]
Fix incorrect printing of queries with duplicated join names.
Given a query in which multiple JOIN nodes used the same alias
(which'd necessarily be in different sub-SELECTs), ruleutils.c
would assign the JOIN nodes distinct aliases for clarity ...
but then it forgot to print the modified aliases when dumping
the JOIN nodes themselves. This results in a dump/reload hazard
for views, because the emitted query is flat-out incorrect:
Vars will be printed with table names that have no referent.
This has been wrong for a long time, so back-patch to all supported
branches.
Tom Lane [Wed, 12 Jun 2019 21:29:48 +0000 (17:29 -0400)]
In walreceiver, don't try to do ereport() in a signal handler.
This is quite unsafe, even for the case of ereport(FATAL) where we won't
return control to the interrupted code, and despite this code's use of
a flag to restrict the areas where we'd try to do it. It's possible
for example that we interrupt malloc or free while that's holding a lock
that's meant to protect against cross-thread interference. Then, any
attempt to do malloc or free within ereport() will result in a deadlock,
preventing the walreceiver process from exiting in response to SIGTERM.
We hypothesize that this explains some hard-to-reproduce failures seen
in the buildfarm.
Hence, get rid of the immediate-exit code in WalRcvShutdownHandler,
as well as the logic associated with WalRcvImmediateInterruptOK.
Instead, we need to take care that potentially-blocking operations
in the walreceiver's data transmission logic (libpqwalreceiver.c)
will respond reasonably promptly to the process's latch becoming
set and then call ProcessWalRcvInterrupts. Much of the needed code
for that was already present in libpqwalreceiver.c. I refactored
things a bit so that all the uses of PQgetResult use latch-aware
waiting, but didn't need to do much more.
These changes should be enough to ensure that libpqwalreceiver.c
will respond promptly to SIGTERM whenever it's waiting to receive
data. In principle, it could block for a long time while waiting
to send data too, and this patch does nothing to guard against that.
I think that that hazard is mostly theoretical though: such blocking
should occur only if we fill the kernel's data transmission buffers,
and we don't generally send enough data to make that happen without
waiting for input. If we find out that the hazard isn't just
theoretical, we could fix it by using PQsetnonblocking, but that
would require more ticklish changes than I care to make now.
Back-patch of commit a1a789eb5. This problem goes all the way back
to the origins of walreceiver; but given the substantial reworking
the module received during the v10 cycle, it seems unsafe to assume
that our testing on HEAD validates this patch for pre-v10 branches.
And we'd need to back-patch some prerequisite patches (at least
597a87ccc and its followups, maybe other things), increasing the risk
of problems. Given the dearth of field reports matching this problem,
it's not worth much risk. Hence back-patch to v10 and v11 only.
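The general pattern (do nothing but record the request in the handler, and act on it later at a safe point) can be sketched in Python. The names below are hypothetical. Setting a latch to wake the main loop, as the walreceiver does, is omitted here.

```python
import signal

shutdown_requested = False


def handle_sigterm(signum, frame):
    """Signal handler: record the request and return immediately.

    No logging, allocation, or exit happens here.
    """
    global shutdown_requested
    shutdown_requested = True


def process_interrupts():
    """Called from ordinary code at safe points, never from the handler."""
    if shutdown_requested:
        raise SystemExit("terminating on administrator command")


def install():
    """Register the handler; must be called from the main thread."""
    signal.signal(signal.SIGTERM, handle_sigterm)
```

A main loop would call process_interrupts() after each wait or blocking operation returns, so the exit path always runs in normal context rather than inside an interrupted malloc or free.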
Tom Lane [Wed, 12 Jun 2019 16:29:24 +0000 (12:29 -0400)]
Fix ALTER COLUMN TYPE failure with a partial exclusion constraint.
ATExecAlterColumnType failed to consider the possibility that an index
that needs to be rebuilt might be a child of a constraint that needs to be
rebuilt. We missed this so far because usually a constraint index doesn't
have a direct dependency on its table, just on the constraint object.
But if there's a WHERE clause, then dependency analysis of the WHERE
clause results in direct dependencies on the column(s) mentioned in WHERE.
This led to trying to drop and rebuild both the constraint and its
underlying index.
In v11/HEAD, we successfully drop both the index and the constraint,
and then try to rebuild both, and of course the second rebuild hits a
duplicate-index-name problem. Before v11, it fails with obscure messages
about a missing relation OID, due to trying to drop the index twice.
This is essentially the same kind of problem noted in commit 20bef2c31:
the possible dependency linkages are broader than what
ATExecAlterColumnType was designed for. It was probably OK when
written, but it's certainly been broken since the introduction of
partial exclusion constraints. Fix by adding an explicit check
for whether any of the indexes-to-be-rebuilt belong to any of the
constraints-to-be-rebuilt, and ignoring any that do.
In passing, fix a latent bug introduced by commit 8b08f7d48: in
get_constraint_index() we must "continue" not "break" when rejecting
a relation of a wrong relkind. This is harmless today because we don't
expect that code path to be taken anyway; but if there ever were any
relations to be ignored, the existing coding would have an extremely
undesirable dependency on the order of pg_depend entries.
Also adjust a couple of obsolete comments.
Per bug #15835 from Yaroslav Schekin. Back-patch to all supported
branches.
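The "continue" versus "break" point can be seen in a toy Python loop. The entry layout and relkind letters are simplified stand-ins for pg_depend rows, and the function is not get_constraint_index() itself.

```python
def find_index(entries, stop_on_mismatch):
    """Scan (relkind, oid) dependency entries for the first index ('i')."""
    for relkind, oid in entries:
        if relkind != "i":
            if stop_on_mismatch:
                break       # old behavior: give up at the first non-index
            continue        # fixed behavior: skip it and keep scanning
        return oid
    return None
```

With entries [("S", 100), ("i", 200)] the "break" variant returns None while the "continue" variant returns 200. Reversing the entries makes both variants succeed, which is the dependency on pg_depend ordering that the commit message warns about.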
Michael Paquier [Wed, 12 Jun 2019 02:31:00 +0000 (11:31 +0900)]
Fix handling of COMMENT for domain constraints
For a non-superuser, changing a comment on a domain constraint was
leading to a cache lookup failure as the code tried to perform the
ownership lookup on the constraint OID itself, thinking that it was a
type, but this check needs to happen on the type the domain constraint
relies on. As the type a domain constraint relies on can be guessed
directly based on the constraint OID, first fetch its type OID and
perform the ownership on it.
This is broken since 7eca575, which has split the handling of comments
for table constraints and domain constraints, so back-patch down to
9.5.
Reported-by: Clemens Ladisch
Author: Daniel Gustafsson, Michael Paquier
Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/15833-808e11904835d26f@postgresql.org
Backpatch-through: 9.5
David Rowley [Tue, 11 Jun 2019 20:09:28 +0000 (08:09 +1200)]
doc: Add best practises section to partitioning docs
A few questionable partitioning designs have been cropping up lately
around the mailing lists. Generally, these cases have been partitioning
using too many partitions which have caused performance or OOM problems for
the users.
Since we have very little else to guide users into good design, here we
add a new section to the partitioning documentation with some best
practise guidelines for good design.
Tom Lane [Tue, 11 Jun 2019 17:33:08 +0000 (13:33 -0400)]
Fix conversion of JSON strings to JSON output columns in json_to_record().
json_to_record(), when an output column is declared as type json or jsonb,
should emit the corresponding field of the input JSON object. But it got
this slightly wrong when the field is just a string literal: it failed to
escape the contents of the string. That typically resulted in syntax
errors if the string contained any double quotes or backslashes.
jsonb_to_record() handles such cases correctly, but I added corresponding
test cases for it too, to prevent future backsliding.
Improve the documentation, as it provided only a very hand-wavy
description of the conversion rules used by these functions.
Per bug report from Robert Vollmert. Back-patch to v10 where the
error was introduced (by commit cf35346e8).
Note that PG 9.4 - 9.6 also get this case wrong, but differently so:
they feed the de-escaped contents of the string literal to json[b]_in.
That behavior is less obviously wrong, so possibly it's being depended on
in the field, so I won't risk trying to make the older branches behave
like the newer ones.
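The escaping problem described above can be illustrated outside the server. The following Python sketch is only an analogy and not the backend code: `field_as_json_unescaped` and `field_as_json_escaped` are hypothetical helpers that turn the contents of a string field into JSON text, the first without escaping and the second with it.

```python
import json


def field_as_json_unescaped(contents: str) -> str:
    # Buggy approach: wrap the raw contents in quotes without escaping.
    return '"' + contents + '"'


def field_as_json_escaped(contents: str) -> str:
    # Correct approach: let a JSON encoder escape quotes and backslashes.
    return json.dumps(contents)


def is_valid_json(text: str) -> bool:
    # True if the text parses as a complete JSON value.
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False
```

A string without double quotes or backslashes survives both helpers, which is presumably why the problem only showed up for some inputs.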
Andres Freund [Tue, 11 Jun 2019 06:20:48 +0000 (23:20 -0700)]
Don't access catalogs to validate GUCs when not connected to a DB.
Vignesh found this bug in default_table_access_method's check hook,
but that was just copied
from older GUCs. Investigation by Michael and me then found the bug in
further places.
When not connected to a database (e.g. in a walsender connection), we
cannot perform (most) GUC checks that need database access. Even when
only shared tables are needed, unless they're
nailed (cf. RelationCacheInitializePhase2()), they cannot be accessed
without pg_class etc. being present.
Fix by extending the existing IsTransactionState() checks to also
check for MyDatabaseOid.
Reported-By: Vignesh C, Michael Paquier, Andres Freund
Author: Vignesh C, Andres Freund
Discussion: https://postgr.es/m/CALDaNm1KXK9gbZfY-p_peRFm_XrBh1OwQO1Kk6Gig0c0fVZ2uw%40mail.gmail.com
Backpatch: 9.4-
Alvaro Herrera [Mon, 10 Jun 2019 22:56:23 +0000 (18:56 -0400)]
Make pg_dump emit ATTACH PARTITION instead of PARTITION OF (reprise)
Using PARTITION OF can result in column ordering being changed from the
database being dumped, if the partition uses a column layout different
from the parent's. It's not pg_dump's job to editorialize on table
definitions, so this is not acceptable; back-patch all the way back to
pg10, where partitioned tables were introduced.
This change also ensures that partitions end up in the correct
tablespace, if different from the parent's; this is an oversight in
ca4103025dfe (in pg12 only). Partitioned indexes (in pg11) don't have
this problem, because they're already created as independent indexes and
attached to their parents afterwards.
This change also has the advantage that the partition is restorable from
the dump (as a standalone table) even if its parent table isn't
restored.
The original commits (3b23552ad8bb in branch master) failed to cover
subsidiary column elements correctly, such as NOT NULL constraint and
CHECK constraints, as reported by Rushabh Lathia (initially as a failure
to restore serial columns). They were reverted. This recapitulation
commit fixes those problems.
Add some pg_dump tests to verify these things more exhaustively,
including constraints with legacy-inheritance tables, which were not
tested originally. In branches 10 and 11, add a local constraint to the
pg_dump test partition that was added by commit 2d7eeb1b1492 to master.
Fix operator naming in pg_trgm GUC option descriptions
Descriptions of pg_trgm GUC options have % replaced with %% like it was
a printf-like format. But that's not needed since they are just plain strings.
This commit fixes that. Backpatch to the last supported version, since this
error has been present from the beginning.
5871b884 introduced pg_trgm.word_similarity_threshold GUC, but its documentation
contains wrong indentation. This commit fixes that. Backpatch for easier
backpatching of other documentation fixes.
Discussion: https://postgr.es/m/4c735d30-ab59-fc0e-45d8-f90eb5ed3855%402ndquadrant.com
Author: Ian Barwick
Backpatch-through: 9.6
Fix copy-pasto in freeing memory on error in vacuumlo.
It's harmless to call PQfreemem() with a NULL argument, so the only
consequence was that if allocating 'schema' failed, but allocating 'table'
or 'field' succeeded, we would leak a bit of memory. That's highly
unlikely to happen, so this is just academic, but let's get it right.
Per bug #15838 from Timur Birsh. Backpatch back to 9.5, where the
PQfreemem() calls were introduced.
Amit Kapila [Fri, 7 Jun 2019 00:05:31 +0000 (05:35 +0530)]
Fix inconsistency in comments atop ExecParallelEstimate.
When this code was initially introduced in commit d1b7c1ff, the structure
used was SharedPlanStateInstrumentation, but later when it got changed to
Instrumentation structure in commit b287df70, we forgot to update the
comment.
Tom Lane [Mon, 3 Jun 2019 22:06:04 +0000 (18:06 -0400)]
Fix contrib/auto_explain to not cause problems in parallel workers.
A parallel worker process should not be making any decisions of its
own about whether to auto-explain. If the parent session process
passed down flags asking for instrumentation data, do that, otherwise
not. Trying to enable instrumentation anyway leads to bugs like the
"could not find key N in shm TOC" failure reported in bug #15821
from Christian Hofstaedtler.
We can implement this cheaply by piggybacking on the existing logic
for not doing anything when we've chosen not to sample a statement.
While at it, clean up some tin-eared coding related to the sampling
feature, including an off-by-one error that meant that asking for 1.0
sampling rate didn't actually result in sampling every statement.
Although the specific case reported here only manifested in >= v11,
I believe that related misbehaviors can be demonstrated in any version
that has parallel query; and the off-by-one error is certainly there
back to 9.6 where that feature was added. So back-patch to 9.6.
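A minimal sketch of the kind of off-by-one described above, assuming a random source that returns integers in the inclusive range 0..MAX_RANDOM_VALUE. The function names and the constant's value are assumptions for illustration; this is not the auto_explain code itself.

```python
MAX_RANDOM_VALUE = 0x7FFFFFFF  # assumed inclusive upper bound of the random source


def sampled_off_by_one(r: int, rate: float) -> bool:
    # With rate == 1.0, the draw r == MAX_RANDOM_VALUE is not sampled.
    return r < rate * MAX_RANDOM_VALUE


def sampled_fixed(r: int, rate: float) -> bool:
    # Scaling by the number of possible draws (MAX + 1) covers every draw.
    return r < rate * (MAX_RANDOM_VALUE + 1)
```

In the fixed form a rate of 1.0 accepts every possible draw and a rate of 0.0 accepts none.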
Michael Paquier [Sat, 1 Jun 2019 19:34:02 +0000 (15:34 -0400)]
Fix documentation of check_option in information_schema.views
Support of CHECK OPTION for updatable views was added in 9.4, but
the documentation of information_schema was never updated, even though
the information displayed is correct.
Tomas Vondra [Thu, 30 May 2019 14:16:12 +0000 (16:16 +0200)]
Make error logging in extended statistics more consistent
Most errors reported in extended statistics are internal issues, and so
should use elog(). The MCV list code was already following this rule, but
the functional dependencies and ndistinct coefficients were using a mix
of elog() and ereport(). Fix this by changing most places to elog(), with
the exception of input functions.
This is a mostly cosmetic change; it makes life a little bit easier
for translators, as elog() messages are not translated. So backpatch to
PostgreSQL 10, where extended statistics were introduced.
Author: Tomas Vondra
Backpatch-through: 10 where extended statistics were added
Discussion: https://postgr.es/m/20190503154404.GA7478@alvherre.pgsql
Noah Misch [Wed, 29 May 2019 02:28:36 +0000 (19:28 -0700)]
MSVC: Add "use File::Path qw(rmtree)".
My back-patch of commit 10b72deafea5972edcafb9eb3f97154f32ccd340 added
calls to File::Path::rmtree(), but v10 and older had not been importing
that symbol. Back-patch to v10, 9.6 and 9.5.
Noah Misch [Tue, 28 May 2019 19:59:00 +0000 (12:59 -0700)]
In the pg_upgrade test suite, don't write to src/test/regress.
When this suite runs installcheck, redirect file creations from
src/test/regress to src/bin/pg_upgrade/tmp_check/regress. This closes a
race condition in "make -j check-world". If the pg_upgrade suite wrote
to a given src/test/regress/results file in parallel with the regular
src/test/regress invocation writing it, a test failed spuriously. Even
without parallelism, in "make -k check-world", the suite finishing
second overwrote the other's regression.diffs. This revealed test
"largeobject" assuming @abs_builddir@ is getcwd(), so fix that, too.
Buildfarm client REL_10, released fifty-four days ago, supports saving
regression.diffs from its new location. When an older client reports a
pg_upgradeCheck failure, it will no longer include regression.diffs.
Back-patch to 9.5, where pg_upgrade moved to src/bin.
Noah Misch [Tue, 28 May 2019 19:58:30 +0000 (12:58 -0700)]
In the pg_upgrade test suite, remove and recreate "tmp_check".
This allows "vcregress upgradecheck" to pass twice in immediate
succession, and it's more like how $(prove_check) works. Back-patch to
9.5, where pg_upgrade moved to src/bin.
Andres Freund [Thu, 23 May 2019 21:46:57 +0000 (14:46 -0700)]
pg_upgrade: Make test.sh's installcheck use to-be-upgraded version's bindir.
On master (after 700538) the old version's installed psql was used -
even when the old version might not actually be installed / might be
installed into a temporary directory. That is commonly the case when just
executing make check for pg_upgrade, as $oldbindir is then just the current
version's $bindir.
In the back branches, with --install specified, psql from the new
version's temporary installation was used; without --install (e.g. for
NO_TEMP_INSTALL, cf 47b3c26642), the new version's installed psql was
used (which might or might not exist).
Author: Andres Freund
Discussion: https://postgr.es/m/20190522175150.c26f4jkqytahajdg@alap3.anarazel.de
Andrew Gierth [Thu, 23 May 2019 14:26:01 +0000 (15:26 +0100)]
Fix array size allocation for HashAggregate hash keys.
When there were duplicate columns in the hash key list, the array
sizes could be miscomputed, resulting in access off the end of the
array. Adjust the computation to ensure the array is always large
enough.
(I considered whether the duplicates could be removed in planning, but
I can't rule out the possibility that duplicate columns might have
different hash functions assigned. Simpler to just make sure it works
at execution time regardless.)
Bug apparently introduced in fc4b3dea2 as part of narrowing down the
tuples stored in the hashtable. Reported by Colm McHugh of Salesforce,
though I didn't use their patch. Backpatch back to version 10 where
the bug was introduced.
Michael Paquier [Thu, 23 May 2019 01:48:29 +0000 (10:48 +0900)]
Fix ordering of GRANT commands in pg_dumpall for tablespaces
This uses a method similar to 68a7c24f and now b8c6014 (applied for
database creation), which guarantees that GRANT commands using the WITH
GRANT OPTION are dumped in a way so as cascading dependencies are
respected. Note that tablespaces do not have support for initial
privileges via pg_init_privs, so the same method needs to be applied
again. It would be nice to merge all the logic generating ACL queries
in dumps under the same banner, but this requires extending the support
of pg_init_privs to objects that cannot use it yet, so this is left as
future work.
Discussion: https://postgr.es/m/20190522071555.GB1278@paquier.xyz
Author: Michael Paquier
Reviewed-by: Nathan Bossart
Backpatch-through: 9.6
Michael Paquier [Wed, 22 May 2019 05:48:30 +0000 (14:48 +0900)]
Fix ordering of GRANT commands in pg_dumpall for database creation
This uses a method similar to 68a7c24f, which guarantees that GRANT
commands using the WITH GRANT OPTION are dumped in a way so as cascading
dependencies are respected. As databases do not have support for
initial privileges via pg_init_privs, we need to repeat the same
ACL reordering method.
ACLs for databases were moved from pg_dumpall to pg_dump in v11, so
this impacts pg_dump for v11 and above, and pg_dumpall for v9.6 and
v10.
Michael Paquier [Mon, 20 May 2019 00:48:37 +0000 (09:48 +0900)]
Fix some grammar in documentation of spgist and pgbench
Discussion: https://postgr.es/m/92961161-9b49-e42f-0a72-d5d47e0ed4de@postgrespro.ru
Author: Liudmila Mantrova
Reviewed-by: Jonathan Katz, Tom Lane, Michael Paquier
Backpatch-through: 9.4
Noah Misch [Sun, 19 May 2019 21:36:44 +0000 (14:36 -0700)]
In the pg_upgrade test suite, don't write to src/test/regress.
When this suite runs installcheck, redirect file creations from
src/test/regress to src/bin/pg_upgrade/tmp_check/regress. This closes a
race condition in "make -j check-world". If the pg_upgrade suite wrote
to a given src/test/regress/results file in parallel with the regular
src/test/regress invocation writing it, a test failed spuriously. Even
without parallelism, in "make -k check-world", the suite finishing
second overwrote the other's regression.diffs. This revealed test
"largeobject" assuming @abs_builddir@ is getcwd(), so fix that, too.
Buildfarm client REL_10, released forty-five days ago, supports saving
regression.diffs from its new location. When an older client reports a
pg_upgradeCheck failure, it will no longer include regression.diffs.
Back-patch to 9.5, where pg_upgrade moved to src/bin.
Andres Freund [Tue, 14 May 2019 18:45:40 +0000 (11:45 -0700)]
Add isolation test for INSERT ON CONFLICT speculative insertion failure.
This path previously was not reliably covered. There was some
heuristic coverage via insert-conflict-toast.spec, but that test is
not deterministic, and only tested for a somewhat specific bug.
Backpatch, as this is a complicated and otherwise untested code
path. Unfortunately 9.5 cannot handle two waiting sessions, and thus
cannot execute this test.
Triggered by a conversation with Melanie Plageman.
Author: Andres Freund
Discussion: https://postgr.es/m/CAAKRu_a7hbyrk=wveHYhr4LbcRnRCG=yPUVoQYB9YO1CdUBE9Q@mail.gmail.com
Backpatch: 9.5-
Peter Geoghegan [Mon, 13 May 2019 22:39:03 +0000 (15:39 -0700)]
Doc: Refer to line pointers as item identifiers.
An upcoming HEAD-only patch will standardize the terminology around
ItemIdData variables/line pointers, ending the practice of referring to
them as "item pointers". Make the "Database Page Layout" docs
consistent with the new policy. The term "item identifier" is already
used in the same section, so stick with that.
Discussion: https://postgr.es/m/CAH2-Wz=c=MZQjUzde3o9+2PLAPuHTpVZPPdYxN=E4ndQ2--8ew@mail.gmail.com
Backpatch: All supported branches.
Tom Lane [Mon, 13 May 2019 21:23:00 +0000 (17:23 -0400)]
Fix logical replication's ideas about which type OIDs are built-in.
Only hand-assigned type OIDs should be presumed to match across different
PG servers; those assigned during genbki.pl or during initdb are likely
to change due to addition or removal of unrelated objects.
This means that the cutoff should be FirstGenbkiObjectId (in HEAD)
or FirstBootstrapObjectId (before that), not FirstNormalObjectId.
Compare postgres_fdw's is_builtin() test.
It's likely that this error has no observable consequence in a
normally-functioning system, since ATM the only affected type OIDs are
system catalog rowtypes and information_schema types, which would not
typically be interesting for logical replication. But you could
probably break it if you tried hard, so back-patch.
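The cutoff rule above amounts to a single comparison. In this sketch the function name and the numeric values of the two constants are assumptions for illustration only; the real values should be taken from the server headers of the branch in question.

```python
FIRST_GENBKI_OBJECT_ID = 10000  # assumed value, for illustration
FIRST_NORMAL_OBJECT_ID = 16384  # assumed value, for illustration


def is_hand_assigned(type_oid: int, cutoff: int = FIRST_GENBKI_OBJECT_ID) -> bool:
    # Only OIDs below the cutoff are presumed to match across servers.
    return type_oid < cutoff
```

With the old, too-generous cutoff (`FIRST_NORMAL_OBJECT_ID`), an OID assigned during genbki.pl or initdb would wrongly be treated as stable across servers.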
Tom Lane [Mon, 13 May 2019 14:53:19 +0000 (10:53 -0400)]
Fix misuse of an integer as a bool.
pgtls_read_pending is declared to return bool, but what the underlying
SSL_pending function returns is a count of available bytes.
This is actually somewhat harmless if we're using C99 bools, but in
the back branches it's a live bug: if the available-bytes count happened
to be a multiple of 256, it would get converted to a zero char value.
On machines where char is signed, counts of 128 and up could misbehave
as well. The net effect is that when using SSL, libpq might block
waiting for data even though some has already been received.
Broken by careless refactoring in commit 4e86f1b16, so back-patch
to 9.5 where that came in.
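The narrowing hazard can be reproduced in miniature. This Python sketch mimics a C conversion to an 8-bit signed char using `ctypes.c_int8`, which performs no overflow checking; the helper names are hypothetical and the code is an illustration, not libpq.

```python
import ctypes


def as_signed_char(count: int) -> int:
    # ctypes integer types do no overflow checking, so this mimics
    # C narrowing of an int to an 8-bit signed char.
    return ctypes.c_int8(count).value


def pending_via_char_bool(count: int) -> bool:
    # Pre-C99 style: the byte count is squeezed into a char-sized "bool".
    return as_signed_char(count) != 0


def pending_fixed(count: int) -> bool:
    # Compare the count explicitly instead of relying on conversion.
    return count > 0
```

A count of 256 narrows to zero and is reported as "nothing pending", while a count of 128 narrows to a negative value on this signed-char model.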
Tom Lane [Sun, 12 May 2019 22:53:13 +0000 (18:53 -0400)]
Fix misoptimization of "{1,1}" quantifiers in regular expressions.
A bounded quantifier with m = n = 1 might be thought a no-op. But
according to our documentation (which traces back to Henry Spencer's
original man page) it still imposes greediness, or non-greediness in the
case of the non-greedy variant "{1,1}?", on whatever it's attached to.
This turns out not to work though, because parseqatom() optimizes away
the m = n = 1 case without regard for whether it's supposed to change
the greediness of the argument RE.
We can fix this by just not applying the optimization when the greediness
needs to change; the subsequent general cases handle it fine.
The three cases in which we can still apply the optimization are
(a) no quantifier, or quantifier does not impose a preference;
(b) atom has no greediness property, implying it cannot match a
variable amount of text anyway; or
(c) quantifier's greediness is same as atom's.
Note that in most cases where one of these applies, we'd have exited
earlier in the "not a messy case" fast path. I think it's now only
possible to get to the optimization when the atom involves capturing
parentheses or a non-top-level backref.
Back-patch to all supported branches. I'd ordinarily be hesitant to
put a subtle behavioral change into back branches, but in this case
it's very hard to see a reason why somebody would write "{1,1}?" unless
they're trying to get the documented change-of-greediness behavior.
Noah Misch [Sun, 12 May 2019 17:33:05 +0000 (10:33 -0700)]
Fail pgwin32_message_to_UTF16() for SQL_ASCII messages.
The function had been interpreting SQL_ASCII messages as UTF8, throwing
an error when they were invalid UTF8. The new behavior is consistent
with pg_do_encoding_conversion(). This affects LOG_DESTINATION_STDERR
and LOG_DESTINATION_EVENTLOG, which will send untranslated bytes to
write() and ReportEventA(). On buildfarm member bowerbird, enabling
log_connections caused an error whenever the role name was not valid
UTF8. Back-patch to 9.4 (all supported versions).
Tom Lane [Sun, 12 May 2019 01:27:13 +0000 (21:27 -0400)]
Rearrange pgstat_bestart() to avoid failures within its critical section.
We long ago decided to design the shared PgBackendStatus data structure to
minimize the cost of writing status updates, which means that writers just
have to increment the st_changecount field twice. That isn't hooked into
any sort of resource management mechanism, which means that if something
were to throw an error between the two increments, the st_changecount field
would be left odd indefinitely. That would cause readers to lock up.
Now, since it's also a bad idea to leave the field odd for longer than
absolutely necessary (because readers will spin while we have it set),
the expectation was that we'd treat these segments like spinlock critical
sections, with only short, more or less straight-line, code in them.
That was fine as originally designed, but commit 9029f4b37 broke it
by inserting a significant amount of non-straight-line code into
pgstat_bestart(), code that is very capable of throwing errors, not to
mention taking a significant amount of time during which readers will spin.
We have a report from Neeraj Kumar of readers actually locking up, which
I suspect was due to an encoding conversion error in X509_NAME_to_cstring,
though conceivably it was just a garden-variety OOM failure.
Subsequent commits have loaded even more dubious code into pgstat_bestart's
critical section (and commit fc70a4b0d deserves some kind of booby prize
for managing to miss the critical section entirely, although the negative
consequences seem minimal given that the PgBackendStatus entry should be
seen by readers as inactive at that point).
The right way to fix this mess seems to be to compute all these values
into a local copy of the process' PgBackendStatus struct, and then just
copy the data back within the critical section proper. This plan can't
be implemented completely cleanly because of the struct's heavy reliance
on out-of-line strings, which we must initialize separately within the
critical section. But still, the critical section is far smaller and
safer than it was before.
In hopes of forestalling future errors of the same ilk, rename the
macros for st_changecount management to make it more apparent that
the writer-side macros create a critical section. And to prevent
the worst consequences if we nonetheless manage to mess it up anyway,
adjust those macros so that they really are a critical section, ie
they now bump CritSectionCount. That doesn't add much overhead, and
it guarantees that if we do somehow throw an error while the counter
is odd, it will lead to PANIC and a database restart to reset shared
memory.
Back-patch to 9.5 where the problem was introduced.
In HEAD, also fix an oversight in commit b0b39f72b: it failed to teach
pgstat_read_current_status to copy st_gssstatus data from shared memory to
local memory. Hence, subsequent use of that data within the transaction
would potentially see changing data that it shouldn't see.
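The change-count protocol and the fix strategy can be sketched in a single-threaded toy model. `StatusSlot`, `write_fragile`, `write_safe` and `read` are hypothetical names; a real reader would keep spinning rather than give up after a bounded number of tries, and this is not the pgstat code.

```python
class StatusSlot:
    """Toy model of a status entry guarded by a change counter."""

    def __init__(self):
        self.changecount = 0  # even: stable, odd: write in progress
        self.data = {}

    def write_fragile(self, compute):
        # Fallible work runs while the counter is odd.
        self.changecount += 1
        self.data.update(compute())  # may raise, leaving the counter odd
        self.changecount += 1

    def write_safe(self, compute):
        # Fallible work runs first; the critical section only copies.
        updates = compute()
        self.changecount += 1
        self.data.update(updates)
        self.changecount += 1

    def read(self, max_tries=1000):
        # Retry until the counter is even and unchanged around the copy.
        for _ in range(max_tries):
            before = self.changecount
            snapshot = dict(self.data)
            after = self.changecount
            if before == after and before % 2 == 0:
                return snapshot
        raise TimeoutError("changecount stayed odd")
```

An exception raised inside `write_fragile` leaves the counter odd, so `read` never succeeds; the same exception in `write_safe` happens before the counter is touched.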
Noah Misch [Sat, 11 May 2019 07:22:38 +0000 (00:22 -0700)]
Honor TEMP_CONFIG in TAP suites.
The buildfarm client uses TEMP_CONFIG to implement its extra_config
setting. Except for stats_temp_directory, extra_config now applies to
TAP suites; extra_config values seen in the past month are compatible
with this. Back-patch to 9.6, where PostgresNode was introduced, so the
buildfarm can rely on it sooner.
Tom Lane [Fri, 10 May 2019 18:56:41 +0000 (14:56 -0400)]
Cope with EINVAL and EIDRM shmat() failures in PGSharedMemoryAttach.
There's a very old race condition in our code to see whether a pre-existing
shared memory segment is still in use by a conflicting postmaster: it's
possible for the other postmaster to remove the segment in between our
shmctl() and shmat() calls. It's a narrow window, and there's no risk
unless both postmasters are using the same port number, but that's possible
during parallelized "make check" tests. (Note that while the TAP tests
take some pains to choose a randomized port number, pg_regress doesn't.)
If it does happen, we treated that as an unexpected case and errored out.
To fix, allow EINVAL to be treated as segment-not-present, and the same
for EIDRM on Linux. AFAICS, the considerations here are basically
identical to the checks for acceptable shmctl() failures, so I documented
and coded it that way.
While at it, adjust PGSharedMemoryAttach's API to remove its undocumented
dependency on UsedShmemSegAddr in favor of passing the attach address
explicitly. This makes it easier to be sure we're using a null shmaddr
when probing for segment conflicts (thus avoiding questions about what
EINVAL means). I don't think there was a bug there, but it required
fragile assumptions about the state of UsedShmemSegAddr during
PGSharedMemoryIsInUse.
Commit c09850992 may have made this failure more probable by applying
the conflicting-segment tests more often. Hence, back-patch to all
supported branches, as that was.
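The errno classification can be sketched as a small predicate. The function name is hypothetical, and as a simplification this version accepts EIDRM wherever the platform defines it rather than on Linux only.

```python
import errno


def attach_failure_means_segment_gone(err: int) -> bool:
    # EINVAL: the segment ID is no longer valid (segment was removed).
    gone = {errno.EINVAL}
    # EIDRM is not defined on every platform, so look it up defensively.
    eidrm = getattr(errno, "EIDRM", None)
    if eidrm is not None:
        gone.add(eidrm)
    return err in gone
```

Any other errno value would still be treated as an unexpected failure.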
Tom Lane [Thu, 9 May 2019 20:52:49 +0000 (16:52 -0400)]
Repair issues with faulty generation of merge-append plans.
create_merge_append_plan failed to honor the CP_EXACT_TLIST flag:
it would generate the expected targetlist but then it felt free to
add resjunk sort targets to it. This demonstrably leads to assertion
failures in v11 and HEAD, and it's probably just accidental that we
don't see the same in older branches. I've not looked into whether
there would be any real-world consequences in non-assert builds.
In HEAD, create_append_plan has sprouted the same problem, so fix
that too (although we do not have any test cases that seem able to
reach that bug). This is an oversight in commit 3fc6e2d7f which
invented the CP_EXACT_TLIST flag, so back-patch to 9.6 where that
came in.
convert_subquery_pathkeys would create pathkeys for subquery output
values if they match any EquivalenceClass known in the outer query
and are available in the subquery's syntactic targetlist. However,
the second part of that condition is wrong, because such values might
not appear in the subquery relation's reltarget list, which would
mean that they couldn't be accessed above the level of the subquery
scan. We must check that they appear in the reltarget list, instead.
This can lead to dropping knowledge about the subquery's sort
ordering, but I believe it's okay, because any sort key that the
outer query actually has any interest in would appear in the
reltarget list.
This second issue is of very long standing, but right now there's no
evidence that it causes observable problems before 9.6, so I refrained
from back-patching further than that. We can revisit that choice if
somebody finds a way to make it cause problems in older branches.
(Developing useful test cases for these issues is really problematic;
fixing convert_subquery_pathkeys removes the only known way to exhibit
the create_merge_append_plan bug, and neither of the test cases added
by this patch causes a problem in all branches, even when considering
the issues separately.)
The second issue explains bug #15795 from Suresh Kumar R ("could not
find pathkey item to sort" with nested DISTINCT queries). I stumbled
across the first issue while investigating that.
Michael Paquier [Thu, 9 May 2019 01:29:40 +0000 (10:29 +0900)]
Fix error status of vacuumdb when multiple jobs are used
When running a batch of VACUUM or ANALYZE commands on a given database,
there were cases where vacuumdb could fail to report an error when it
should have, leading to incorrect status results.
Author: Julien Rouhaud
Reviewed-by: Amit Kapila, Michael Paquier
Discussion: https://postgr.es/m/CAOBaU_ZuTwz7CtqLYJ1Ouuh272bTQPLN8b1bAPk0bCBm4PDMTQ@mail.gmail.com
Backpatch-through: 9.5
Fujii Masao [Wed, 8 May 2019 16:35:13 +0000 (01:35 +0900)]
Fix documentation for the privileges required for replication functions.
Previously it was documented that use of replication functions is
restricted to superusers. This is true for the functions which
use replication origin, but not for pg_logical_emit_message() and
functions which use replication slot. For example, not only
superusers but also users with REPLICATION privilege are allowed
to use the functions for replication slot. This commit fixes
the documentation for the privileges required for those replication
functions.
Thomas Munro [Mon, 6 May 2019 03:02:41 +0000 (15:02 +1200)]
Probe only 127.0.0.1 when looking for ports on Unix.
Commit c0985099, later adjusted by commit 4ab02e81, probed 0.0.0.0
in addition to 127.0.0.1, for the benefit of Windows build farm
animals. It isn't really useful on Unix systems, and turned out to
be a bit inconvenient to users of some corporate firewall software.
Switch back to probing just 127.0.0.1 on non-Windows systems.
Michael Paquier [Tue, 7 May 2019 05:20:01 +0000 (14:20 +0900)]
Remove some code related to 7.3 and older servers from tools of src/bin/
This code was broken as of 582edc3, and is most likely not used anymore.
Note that pg_dump supports servers down to 8.0, and psql has code to
support servers down to 7.4.
Author: Julien Rouhaud
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/CAOBaU_Y5y=zo3+2gf+2NJC1pvMYPcbRXoQaPXx=U7+C8Qh4CzQ@mail.gmail.com
Alvaro Herrera [Mon, 6 May 2019 16:23:49 +0000 (12:23 -0400)]
Revert "Make pg_dump emit ATTACH PARTITION instead of PARTITION OF"
... and fallout (from branches 10, 11 and master). The change was
ill-considered, and it broke a few normal use cases; since we don't have
time to fix it, we'll try again after this week's minor releases.
Dean Rasheed [Mon, 6 May 2019 10:58:32 +0000 (11:58 +0100)]
Use checkAsUser for selectivity estimator checks, if it's set.
In examine_variable() and examine_simple_variable(), when checking the
user's table and column privileges to determine whether to grant
access to the pg_statistic data, use checkAsUser for the privilege
checks, if it's set. This will be the case if we're accessing the
table via a view, to indicate that we should perform privilege checks
as the view owner rather than the current user.
This change makes this planner check consistent with the check in the
executor, so the planner will be able to make use of statistics if the
table is accessible via the view. This fixes a performance regression
introduced by commit e2d4ef8de8, which affects queries against
non-security barrier views in the case where the user doesn't have
privileges on the underlying table, but the view owner does.
Note that it continues to provide the same safeguards controlling
access to pg_statistic for direct table access (in which case
checkAsUser won't be set) and for security barrier views, because of
the nearby checks on rte->security_barrier and rte->securityQuals.
Back-patch to all supported branches because e2d4ef8de8 was.
Dean Rasheed, reviewed by Jonathan Katz and Stephen Frost.
Dean Rasheed [Mon, 6 May 2019 10:43:09 +0000 (11:43 +0100)]
Fix security checks for selectivity estimation functions with RLS.
In commit e2d4ef8de8, security checks were added to prevent
user-supplied operators from running over data from pg_statistic
unless the user has table or column privileges on the table, or the
operator is leakproof. For a table with RLS, however, checking for
table or column privileges is insufficient, since that does not
guarantee that the user has permission to view all of the column's
data.
Fix this by also checking for securityQuals on the RTE, and insisting
that the operator be leakproof if there are any. Thus the
leakproofness check will only be skipped if there are no securityQuals
and the user has table or column privileges on the table -- i.e., only
if we know that the user has access to all the data in the column.
Back-patch to 9.5 where RLS was added.
Dean Rasheed, reviewed by Jonathan Katz and Stephen Frost.
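The resulting rule reduces to a two-input predicate. The function and parameter names below are hypothetical and only restate the condition given in the commit message; this is not the planner code.

```python
def must_require_leakproof(has_security_quals: bool,
                           has_table_or_column_privs: bool) -> bool:
    # The leakproofness check is skipped only when there are no
    # securityQuals AND the user has table or column privileges.
    return has_security_quals or not has_table_or_column_privs
```

Only one of the four input combinations lets a non-leakproof operator see the statistics.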
Andres Freund [Mon, 6 May 2019 06:31:58 +0000 (23:31 -0700)]
Remove reindex_catalog test from test schedules.
As the test currently causes occasional deadlocks (due to the schema
cleanup from previous sessions potentially still running), and the
patch from f912d7dec2 has gotten a fair bit of buildfarm coverage,
remove the test from the test schedules. There's a set of minor
releases coming up.
Leave the test in place, so it can be run manually using EXTRA_TESTS.
For now also leave it in master, as there's no imminent release, and
there's plenty (re-)index related work in 12. But we'll have to
disable it before long there too, unless somebody comes up with simple
enough fixes for the deadlock (I'm about to post a vague idea to the
list).
Discussion: https://postgr.es/m/4622.1556982247@sss.pgh.pa.us
Backpatch: 9.4-11 (no master!)