Tom Lane [Tue, 23 Apr 2019 22:51:31 +0000 (18:51 -0400)]
Repair assorted issues in locale data extraction.
cache_locale_time (extraction of LC_TIME-related info) had never been
taught the lessons we previously learned about extraction of info related
to LC_MONETARY and LC_NUMERIC. Specifically, commit 95a777c61 taught
PGLC_localeconv() that data coming out of localeconv() was in an encoding
determined by the relevant locale, but we didn't realize that there's a
similar issue with strftime(). And commit a4930e7ca hardened
PGLC_localeconv() against errors occurring partway through, but failed
to do likewise for cache_locale_time(). So, rearrange the latter
function to perform encoding conversion and not risk failure while
it's got the locales set to temporary values.
This time around I also changed PGLC_localeconv() to treat it as FATAL
if it can't restore the previous settings of the locale values. There
is no reason (except possibly OOM) for that to fail, and proceeding with
the wrong locale values seems like a seriously bad idea --- especially
on Windows where we have to also temporarily change LC_CTYPE. Also,
protect against the possibility that we can't identify the codeset
reported for LC_MONETARY or LC_NUMERIC; rather than just failing,
try to validate the data without conversion.
The user-visible symptom this fixes is that if LC_TIME is set to a locale
name that implies an encoding different from the database encoding,
non-ASCII localized day and month names would be retrieved in the wrong
encoding, leading to either unexpected encoding-conversion error reports
or wrong output from to_char(). The other possible failure modes are
unlikely enough that we've not seen reports of them, AFAIK.
The encoding conversion problems do not manifest on Windows, since
we'd already created special-case code to handle that issue there.
Per report from Juan José Santamaría Flecha. Back-patch to all
supported versions.
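For illustration, the symptom could be triggered roughly like this
(assuming a UTF8 database and an available latin-1 German locale;
names are illustrative):

    SET lc_time = 'de_DE.ISO8859-1'; -- encoding differs from the database
    SELECT to_char(now(), 'TMDay');  -- localized day name came back in the
                                     -- wrong encoding before this fix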
Michael Paquier [Tue, 23 Apr 2019 06:43:32 +0000 (15:43 +0900)]
Fix detection of passwords hashed with MD5 or SCRAM-SHA-256
This commit fixes a couple of issues related to the way password
verifiers hashed with MD5 or SCRAM-SHA-256 are detected, which made it
possible to store in the catalogs passwords that do not follow the
supported hash formats:
- An MD5-hashed entry was checked based on whether its header uses "md5"
and whether the string length matches what is expected. Unfortunately
the code never checked whether the hash uses only hexadecimal
characters, as reported by Tom Lane.
- A SCRAM-hashed entry was checked based on only its header, which
should be "SCRAM-SHA-256$", but it never checked for any fields
afterwards, as reported by Jonathan Katz.
Backpatch down to v10, which is where SCRAM has been introduced, and
where password verifiers in plain format have been removed.
Author: Jonathan Katz
Reviewed-by: Tom Lane, Michael Paquier
Discussion: https://postgr.es/m/016deb6b-1f0a-8e9f-1833-a8675b170aa9@postgresql.org
Backpatch-through: 10
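For illustration, both of these malformed verifiers were previously
accepted and stored as-is (hypothetical role names):

    ALTER ROLE alice PASSWORD 'md5zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz';
        -- "md5" prefix and the right length, but not hexadecimal
    ALTER ROLE bob PASSWORD 'SCRAM-SHA-256$';
        -- valid header, but no fields after it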
Fix documentation of pg_start_backup and pg_stop_backup functions.
This commit adds the description that "non-exclusive" pg_start_backup
and pg_stop_backup can be executed even during recovery. Previously
it was wrongly documented that those functions are not allowed to be
executed during recovery.
Back-patch to 9.6 where non-exclusive backup API was added.
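A minimal sketch of the now-documented usage on a standby:

    SELECT pg_start_backup('label', false, false); -- exclusive => false
    -- ... copy the data directory ...
    SELECT * FROM pg_stop_backup(false);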
Tom Lane [Fri, 19 Apr 2019 15:20:37 +0000 (11:20 -0400)]
Fix problems with auto-held portals.
HoldPinnedPortals() did things in the wrong order: it must not mark
a portal autoHeld until it's been successfully held. Otherwise,
a failure while persisting the portal results in a server crash
because we think the portal is in a good state when it's not.
Also add a check that portal->status is READY before attempting to
hold a pinned portal. We have such a check before the only other
use of HoldPortal(), so it seems unwise not to check it here.
Lastly, rethink the responsibility for where to call HoldPinnedPortals.
The comment for it imagined that it was optional for any individual PL
to call it or not, but that cannot be the case: if some outer level of
procedure has a pinned portal, failing to persist it when an inner
procedure commits is going to be trouble. Let's have SPI do it instead
of the individual PLs. That's not a complete solution, since in theory
a PL might not be using SPI to perform commit/rollback, but such a PL
is going to have to be aware of lots of related requirements anyway.
(This change doesn't cause an API break for any external PLs that might
be calling HoldPinnedPortals per the old regime, because calling it
twice during a commit or rollback sequence won't hurt.)
Per bug #15703 from Julian Schauder. Back-patch to v11 where this code
came in.
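A sketch of the kind of usage involved, where a pinned portal (the FOR
loop's implicit cursor) must be held across a commit:

    DO $$
    DECLARE r record;
    BEGIN
      FOR r IN SELECT generate_series(1, 10) g LOOP
        COMMIT;  -- the pinned loop portal must be persisted here
      END LOOP;
    END $$;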
Peter Eisentraut [Tue, 16 Apr 2019 08:37:44 +0000 (10:37 +0200)]
Fix handling of temp and unlogged tables in FOR ALL TABLES publications
If a FOR ALL TABLES publication exists, temporary and unlogged tables
are ignored for publishing changes. But CheckCmdReplicaIdentity()
would still check in that case that such a table has a replica
identity set before accepting updates. To fix, have
GetRelationPublicationActions() return that such a table publishes no
actions.
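For illustration (the exact error text may vary):

    CREATE PUBLICATION pub FOR ALL TABLES;
    CREATE TEMP TABLE tmp (a int);
    INSERT INTO tmp VALUES (1);
    UPDATE tmp SET a = 2;  -- before the fix: rejected for lacking a replica
                           -- identity, although temp tables are not published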
Bruce Momjian [Wed, 17 Apr 2019 22:12:10 +0000 (18:12 -0400)]
postgresql.conf.sample: add proper defaults for include actions
Previously, include actions include_dir, include_if_exists, and include
listed commented-out values which were not the defaults, which is
inconsistent with other entries. Instead, replace them with '', which
is the default value.
Bruce Momjian [Wed, 17 Apr 2019 22:01:02 +0000 (18:01 -0400)]
docs: clarify pg_upgrade's recovery behavior
The previous paragraph trying to explain --check, --link, and no --link
modes and the various points of failure was too complex. Instead, use
bullet lists and sublists.
Reported-by: Daniel Gustafsson
Discussion: https://postgr.es/m/qtqiv7hI87s_Xvz5ZXHCaH-1-_AZGpIDJowzlRjF3-AbCr3RhSNydM_JCuJ8DE4WZozrtxhIWmyYTbv0syKyfGB6cYMQitp9yN-NZMm-oAo=@yesql.se
Tom Lane [Wed, 17 Apr 2019 21:30:29 +0000 (17:30 -0400)]
Fix unportable code in pgbench.
The buildfarm points out that UINT64_FORMAT might not work with sscanf;
it's calibrated for our printf implementation, which might not agree
with the platform-supplied sscanf. Fall back to just accepting an
unsigned long, which is already more than the documentation promises.
Oversight in e6c3ba7fb; back-patch to v11, as that was.
Fix division by zero in _bt_vacuum_needs_cleanup()
Checks inside _bt_vacuum_needs_cleanup() allow division by zero to happen when
metad->btm_last_cleanup_num_heap_tuples == 0. This commit adjusts the
expression so that no division by zero might happen.
Reported-by: Piotr Stefaniak
Discussion: https://postgr.es/m/DB8PR03MB5931C41F7787A95313F08322F22A0%40DB8PR03MB5931.eurprd03.prod.outlook.com
Reviewed-by: Masahiko Sawada
Backpatch-through: 11
Michael Paquier [Mon, 15 Apr 2019 03:34:51 +0000 (12:34 +0900)]
Fix SHOW ALL command for non-superusers with replication connection
Since Postgres 10, SHOW commands can be triggered with replication
connections in a WAL sender context; however, that change missed that a
transaction context is needed for syscache lookups. This commit makes
sure that the syscache lookups can happen correctly by setting a
transaction context when running SHOW commands in a WAL sender.
Superuser-only parameters can be displayed using SHOW commands not only
by superusers, but also by members of the system role
pg_read_all_settings, which requires a syscache lookup to check whether
the connected role is a member of that role; without a transaction
context, that lookup crashed the instance. Superusers do not need the
syscache check, so that case worked correctly.
New tests are added to cover this issue.
Reported-by: Alexander Kukushkin
Author: Michael Paquier
Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/15734-2daa8761eeed8e20@postgresql.org
Backpatch-through: 10
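For illustration, assuming a non-superuser role "monitor" that is a
member of pg_read_all_settings, connected with replication=database:

    SHOW ALL;  -- crashed the instance before the fix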
Test both 0.0.0.0 and 127.0.0.x addresses to find a usable port.
Commit c098509927f9a49ebceb301a2cb6a477ecd4ac3c changed
PostgresNode::get_new_node() to probe 0.0.0.0 instead of 127.0.0.1, but
the new test was less effective for Windows native Perl. This increased
the failure rate of buildfarm members bowerbird and jacana. Instead,
test 0.0.0.0 and concrete addresses. This restores the old level of
defense, but the algorithm is still subject to its longstanding time of
check to time of use race condition. Back-patch to 9.6, like the
previous change.
Tom Lane [Sat, 13 Apr 2019 17:22:26 +0000 (13:22 -0400)]
Prevent memory leaks associated with relcache rd_partcheck structures.
The original coding of generate_partition_qual() just copied the list
of predicate expressions into the global CacheMemoryContext, making it
effectively impossible to clean up when the owning relcache entry is
destroyed --- the relevant code in RelationDestroyRelation() only managed
to free the topmost List header :-(. This resulted in a session-lifespan
memory leak whenever a table partition's relcache entry is rebuilt.
Fortunately, that's not normally a large data structure, and rebuilds
shouldn't occur all that often in production situations; but this is
still a bug worth fixing back to v10 where the code was introduced.
To fix, put the cached expression tree into its own small memory context,
as we do with other complicated substructures of relcache entries.
Also, deal more honestly with the case that a partition has an empty
partcheck list; while that probably isn't a case that's very interesting
for production use, it's legal.
In passing, clarify comments about how partitioning-related relcache
data structures are managed, and add some Asserts that we're not leaking
old copies when we overwrite these data fields.
postmaster startup scrutinizes any shared memory segment recorded in
postmaster.pid, exiting if that segment matches the current data
directory and has an attached process. When the postmaster.pid file was
missing, a starting postmaster used weaker checks. Change to use the
same checks in both scenarios. This increases the chance of a startup
failure, in lieu of data corruption, if the DBA does "kill -9 `head -n1
postmaster.pid` && rm postmaster.pid && pg_ctl -w start". A postmaster
will no longer stop if shmat() of an old segment fails with EACCES. A
postmaster will no longer recycle segments pertaining to other data
directories. That's good for production, but it's bad for integration
tests that crash a postmaster and immediately delete its data directory.
Such a test now leaks a segment indefinitely. No "make check-world"
test does that. win32_shmem.c already avoided all these problems. In
9.6 and later, enhance PostgresNode to facilitate testing. Back-patch
to 9.4 (all supported versions).
Reviewed (in earlier versions) by Daniel Gustafsson and Kyotaro HORIGUCHI.
Tom Lane [Wed, 10 Apr 2019 23:02:21 +0000 (19:02 -0400)]
Fix backwards test in operator_precedence_warning logic.
Warnings about unary minus might have been wrong. It's a bit
surprising that nobody noticed yet ... probably the precedence-warning
feature hasn't really been used much in the field.
Amit Kapila [Wed, 10 Apr 2019 03:06:42 +0000 (08:36 +0530)]
Avoid counting transaction stats for parallel worker cooperating
transaction.
The transaction that is initiated by the parallel worker to cooperate
with the actual transaction started by the main backend to complete the
query execution should not be counted as a separate transaction. The
other internal transactions started and committed by the parallel worker
are still counted as separate transactions, as that is what we do in
other places like autovacuum.
This will partially fix the bloat in transaction stats due to additional
transactions performed by parallel workers. For a complete fix, we need to
decide how we want to show all the transactions that are started internally
for various operations, and that is a matter for a separate patch.
Reported-by: Haribabu Kommi
Author: Haribabu Kommi
Reviewed-by: Amit Kapila, Jamison Kirk and Rahila Syed
Backpatch-through: 9.6
Discussion: https://postgr.es/m/CAJrrPGc9=jKXuScvNyQ+VNhO0FZk7LLAShAJRyZjnedd2D61EQ@mail.gmail.com
Avoid "could not reattach" by providing space for concurrent allocation.
We've long had reports of intermittent "could not reattach to shared
memory" errors on Windows. Buildfarm member dory fails that way when
PGSharedMemoryReAttach() execution overlaps with creation of a thread
for the process's "default thread pool". Fix that by providing a second
region to receive asynchronous allocations that would otherwise intrude
into UsedShmemSegAddr. In pgwin32_ReserveSharedMemoryRegion(), stop
trying to free reservations landing at incorrect addresses; the caller's
next step has been to terminate the affected process. Back-patch to 9.4
(all supported versions).
Tom Lane [Mon, 8 Apr 2019 20:09:07 +0000 (16:09 -0400)]
Fix improper interaction of FULL JOINs with lateral references.
join_is_legal() needs to reject forming certain outer joins in cases
where that would lead the planner down a blind alley. However, it
mistakenly supposed that the way to handle full joins was to treat them
as applying the same constraints as for left joins, only to both sides.
That doesn't work, as shown in bug #15741 from Anthony Skorski: given
a lateral reference out of a join that's fully enclosed by a full join,
the code would fail to believe that any join ordering is legal, resulting
in errors like "failed to build any N-way joins".
However, we don't really need to consider full joins at all for this
purpose, because we effectively force them to be evaluated in syntactic
order, and that order is always legal for lateral references. Hence,
get rid of this broken logic for full joins and just ignore them instead.
This seems to have been an oversight in commit 7e19db0c0.
Back-patch to all supported branches, as that was.
Tom Lane [Mon, 8 Apr 2019 16:20:22 +0000 (12:20 -0400)]
Fix EvalPlanQualStart to handle partitioned result rels correctly.
The es_root_result_relations array needs to be shallow-copied in the
same way as the main es_result_relations array, else EPQ rechecks on
partitioned result relations fail, as seen in bug #15677 from
Norbert Benkocs.
Michael Paquier [Mon, 8 Apr 2019 04:45:14 +0000 (13:45 +0900)]
Fix partition tuple routing with dropped attributes
When trying to insert a tuple into a partitioned table, the routing to
the correct partition got messed up by confusing the case where a tuple
needs to be stored in an intermediate parent's slot with the case where
a tuple needs to be converted because of attribute changes between the
immediate parent relation and the parent relation one level above that
(the grandparent).
This could trigger errors like the following:
ERROR:  cannot extract attribute from empty tuple slot
SQL state: XX000
This was not detected because regression tests with dropped attributes
only included tests with two levels of partitioning, and this can be
triggered with three levels or more.
This fixes bug #15733, which was introduced by 34295b8. The bug
happens only on REL_11_STABLE; HEAD gains the regression tests added
for this bug.
Reported-by: Petr Fedorov
Author: Amit Langote, Michael Paquier
Discussion: https://postgr.es/m/15733-7692379e310b80ec@postgresql.org
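A sketch of the kind of hierarchy involved (three levels plus a dropped
attribute; not the exact regression test):

    CREATE TABLE root (drop1 int, a int) PARTITION BY LIST (a);
    ALTER TABLE root DROP COLUMN drop1;
    CREATE TABLE mid PARTITION OF root FOR VALUES IN (1)
        PARTITION BY LIST (a);
    CREATE TABLE leaf PARTITION OF mid FOR VALUES IN (1);
    INSERT INTO root VALUES (1);  -- tuple conversion needed at each level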
Tom Lane [Sun, 7 Apr 2019 22:18:59 +0000 (18:18 -0400)]
Avoid fetching past the end of the indoption array.
pg_get_indexdef_worker carelessly fetched indoption entries even for
non-key index columns that don't have one. 99.999% of the time this
would be harmless, since the code wouldn't examine the value ... but
some fine day this will be a fetch off the end of memory, resulting
in SIGSEGV.
Detected through valgrind testing. Odd that the buildfarm's valgrind
critters haven't noticed.
Before those commits, partitioning-related code in the executor could
assume that ModifyTableState.resultRelInfo[] contains only leaf partitions.
However, now a fully-pruned update results in a dummy ModifyTable that
references the root partitioned table, and that breaks some stuff.
In v11, this led to an assertion or core dump in the tuple routing code.
Fix by disabling tuple routing, since we don't need that anyway.
(I chose to do that in HEAD as well for safety, even though the problem
doesn't manifest in HEAD as it stands.)
In v10, this confused ExecInitModifyTable's decision about whether it
needed to close the root table. But we can get rid of that altogether
by being smarter about where to find the root table.
Note that since the referenced commits haven't shipped yet, this
isn't fixing any bug the field has seen.
Tom Lane [Sat, 6 Apr 2019 19:09:10 +0000 (15:09 -0400)]
Fix failures in validateForeignKeyConstraint's slow path.
The foreign-key-checking loop in ATRewriteTables failed to ignore
relations without storage (e.g., partitioned tables), unlike the
initial loop. This accidentally worked as long as RI_Initial_Check
succeeded, which it does in most practical cases (including all the
ones exercised in the existing regression tests :-(). However, if
that failed, as for instance when there are permissions issues,
then we entered the slow fire-the-trigger-on-each-tuple path.
And that would try to read from the referencing relation, and fail
if it lacks storage.
A second problem, recently introduced in HEAD, was that this loop
had been broken by sloppy refactoring for the tableam API changes.
Repair both issues, and add a regression test case so we have some
coverage on this code path. Back-patch as needed to v11.
(It looks like this code could do with additional bulletproofing,
but let's get a working test case in place first.)
Doc: Update documentation on partitioning vs. foreign tables.
The limitation that it is not allowed to create or attach a foreign
table as a partition of an indexed partitioned table was not documented.
Reported-By: Stepan Yankevych
Author: Etsuro Fujita
Reviewed-By: Amit Langote
Backpatch-through: 11 where partitioned index was introduced
Discussion: https://postgr.es/m/1553869152.858391073.5f8m3n0x@frv53.fwdcdn.com
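For illustration (hypothetical foreign server name):

    CREATE TABLE parent (a int PRIMARY KEY) PARTITION BY LIST (a);
    CREATE FOREIGN TABLE part1 PARTITION OF parent
        FOR VALUES IN (1) SERVER remote_srv;
    -- rejected: the required index cannot be created on the foreign table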
Michael Paquier [Fri, 5 Apr 2019 01:38:21 +0000 (10:38 +0900)]
Fix some documentation in pg_rewind
Since 11, it is possible to use a non-superuser role when using an
online source cluster with pg_rewind as long as the role has proper
permissions to execute on the source all the functions used by
pg_rewind, but the documentation stated that a superuser is necessary.
Let's also add all the details needed to create such a role.
A second point of frequent confusion among users is that it is necessary
to issue a checkpoint on a freshly-promoted standby so that its control
file has up-to-date timeline information, which is used by pg_rewind to
validate the operation. Let's document that properly. This is
back-patched down to 9.5 where pg_rewind has been introduced.
Author: Michael Paquier
Reviewed-by: Magnus Hagander
Discussion: https://postgr.es/m/CABUevEz5bpvbwVsYCaSMV80CBZ5-82nkMzbb+Bu=h1m=rLdn=g@mail.gmail.com
Backpatch-through: 9.5
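The role setup now described in the documentation looks roughly like
this (role name is illustrative):

    CREATE USER rewind_user LOGIN;
    GRANT EXECUTE ON function pg_catalog.pg_ls_dir(text, boolean, boolean)
        TO rewind_user;
    GRANT EXECUTE ON function pg_catalog.pg_stat_file(text, boolean)
        TO rewind_user;
    GRANT EXECUTE ON function pg_catalog.pg_read_binary_file(text)
        TO rewind_user;
    GRANT EXECUTE ON function
        pg_catalog.pg_read_binary_file(text, bigint, bigint, boolean)
        TO rewind_user;

and, on the freshly-promoted standby, before running pg_rewind:

    CHECKPOINT;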
Make src/test/recovery/t/017_shm.pl safe for concurrent execution.
Buildfarm members idiacanthus and komodoensis, which share a host, both
executed this test in the same second. That failed. Back-patch to 9.6,
where the test first appeared.
Handle USE_MODULE_DB for all tests able to use an installed postmaster.
When $(MODULES) and $(MODULE_big) are empty, derive the database name
from the first element of $(REGRESS) instead of using a constant string.
When deriving the database name from $(MODULES), use its first element
instead of the entire list; the earlier approach would fail if any
multi-module directory had $(REGRESS) tests. Treat isolation suites and
src/pl correspondingly. Under USE_MODULE_DB=1, installcheck-world and
check-world no longer reuse any database name in a given postmaster.
Buildfarm members axolotl, mandrill and frogfish saw spurious "is being
accessed by other users" failures that would not have happened without
database name reuse. (The CountOtherDBBackends() 5s deadline expired
during DROP DATABASE; a backend for an earlier test suite had used the
same database name and had not yet exited.) Back-patch to 9.4 (all
supported versions), except bits pertaining to isolation suites.
Concept reviewed by Andrew Dunstan, Andres Freund and Tom Lane.
postmaster startup scrutinizes any shared memory segment recorded in
postmaster.pid, exiting if that segment matches the current data
directory and has an attached process. When the postmaster.pid file was
missing, a starting postmaster used weaker checks. Change to use the
same checks in both scenarios. This increases the chance of a startup
failure, in lieu of data corruption, if the DBA does "kill -9 `head -n1
postmaster.pid` && rm postmaster.pid && pg_ctl -w start". A postmaster
will no longer recycle segments pertaining to other data directories.
That's good for production, but it's bad for integration tests that
crash a postmaster and immediately delete its data directory. Such a
test now leaks a segment indefinitely. No "make check-world" test does
that. win32_shmem.c already avoided all these problems. In 9.6 and
later, enhance PostgresNode to facilitate testing. Back-patch to 9.4
(all supported versions).
Reviewed by Daniel Gustafsson and Kyotaro HORIGUCHI.
Dean Rasheed [Tue, 2 Apr 2019 07:17:04 +0000 (08:17 +0100)]
Perform RLS subquery checks as the right user when going via a view.
When accessing a table with RLS via a view, the RLS checks are
performed as the view owner. However, the code neglected to propagate
that to any subqueries in the RLS checks. Fix that by calling
setRuleCheckAsUser() for all RLS policy quals and withCheckOption
checks for RTEs with RLS.
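A minimal sketch of the scenario (table and policy names are
illustrative):

    CREATE TABLE t (a int);
    ALTER TABLE t ENABLE ROW LEVEL SECURITY;
    CREATE POLICY p ON t
        USING (a IN (SELECT x FROM allowed));  -- subquery in the policy qual
    CREATE VIEW v AS SELECT * FROM t;
    -- before the fix, the subquery on "allowed" was checked as the calling
    -- user rather than as the owner of v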
One should almost always terminate an old process, not use a manual
removal tool like ipcrm. Removal of the ipcclean script eleven years
ago (39627b1ae680cba44f6e56ca5facec4fdbfe9495) and its non-replacement
corroborate that manual shm removal is now a niche goal. Back-patch to
9.4 (all supported versions).
Reviewed by Daniel Gustafsson and Kyotaro HORIGUCHI.
Tom Lane [Sat, 30 Mar 2019 16:48:19 +0000 (12:48 -0400)]
Avoid crash in partitionwise join planning under GEQO.
While trying to plan a partitionwise join, we may be faced with cases
where one or both input partitions for a particular segment of the join
have been pruned away. In HEAD and v11, this is problematic because
earlier processing didn't bother to make a pruned RelOptInfo fully
valid. With an upcoming patch to make partition pruning more efficient,
this'll be even more problematic because said RelOptInfo won't exist at
all.
The existing code attempts to deal with this by retroactively making the
RelOptInfo fully valid, but that causes crashes under GEQO because join
planning is done in a short-lived memory context. In v11 we could
probably have fixed this by switching to the planner's main context
while fixing up the RelOptInfo, but that idea doesn't scale well to the
upcoming patch. It would be better not to mess with the base-relation
data structures during join planning, anyway --- that's just a recipe
for order-of-operations bugs.
In many cases, though, we don't actually need the child RelOptInfo,
because if the input is certainly empty then the join segment's result
is certainly empty, so we can skip making a join plan altogether. (The
existing code ultimately arrives at the same conclusion, but only after
doing a lot more work.) This approach works except when the pruned-away
partition is on the nullable side of a LEFT, ANTI, or FULL join, and the
other side isn't pruned. But in those cases the existing code leaves a
lot to be desired anyway --- the correct output is just the result of
the unpruned side of the join, but we were emitting a useless outer join
against a dummy Result. Pending somebody writing code to handle that
more nicely, let's just abandon the partitionwise-join optimization in
such cases.
When the modified code skips making a join plan, it doesn't make a
join RelOptInfo either; this requires some upper-level code to
cope with nulls in part_rels[] arrays. We would have had to have
that anyway after the upcoming patch.
Back-patch to v11 since the crash is demonstrable there.
Thomas Munro [Wed, 27 Mar 2019 08:16:50 +0000 (21:16 +1300)]
Fix off-by-one error in txid_status().
The transaction ID returned by GetNextXidAndEpoch() is in the future,
so we can't attempt to access its status or we might try to read a
CLOG page that doesn't exist. The > vs >= confusion probably stemmed
from the choice of a variable name containing the word "last" instead
of "next", so fix that too.
Back-patch to 10 where the function arrived.
Author: Thomas Munro
Discussion: https://postgr.es/m/CA%2BhUKG%2Buua_BV5cyfsioKVN2d61Lukg28ECsWTXKvh%3DBtN2DPA%40mail.gmail.com
Tomas Vondra [Wed, 27 Mar 2019 01:39:39 +0000 (02:39 +0100)]
Track unowned relations in doubly-linked list
Relations dropped in a single transaction are tracked in a list of
unowned relations. With a large number of dropped relations this resulted
in poor performance at the end of a transaction, when the relations are
removed from the singly linked list one by one.
Commit b4166911 attempted to address this issue (particularly when it
happens during recovery) by removing the relations in a reverse order,
resulting in O(1) lookups in the list of unowned relations. This did
not work reliably, though, and it was possible to trigger the O(N^2)
behavior in various ways.
Instead of trying to remove the relations in a specific order with
respect to the linked list, which seems rather fragile, switch to a
regular doubly linked list. That allows us to remove relations cheaply no
matter where in the list they are.
As b4166911 was a bugfix, backpatched to all supported versions, do the
same thing here.
Alvaro Herrera [Tue, 26 Mar 2019 23:19:39 +0000 (20:19 -0300)]
Fix partitioned index creation bug with dropped columns
ALTER INDEX .. ATTACH PARTITION fails if the partitioned table where the
index is defined contains more dropped columns than its partition, with
this message:
ERROR: incorrect attribute map
The cause was that one caller of CompareIndexInfo was passing the number
of attributes of the partition rather than the parent, which confused
the length check. Repair.
This can cause pg_upgrade to fail when used on such a database. Leave
some more objects around after regression tests, so that the case is
detected by pg_upgrade test suite.
Remove some spurious empty lines noticed while looking for other cases
of the same problem.
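A rough reproducer (not the exact regression test):

    CREATE TABLE parent (drop1 int, a int) PARTITION BY LIST (a);
    ALTER TABLE parent DROP COLUMN drop1;  -- parent has a dropped column
    CREATE INDEX ON parent (a);
    CREATE TABLE part1 (a int);            -- partition-to-be has none
    CREATE INDEX ON part1 (a);
    ALTER TABLE parent ATTACH PARTITION part1 FOR VALUES IN (1);
    -- before the fix: ERROR: incorrect attribute map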
Tom Lane [Sun, 24 Mar 2019 19:13:21 +0000 (15:13 -0400)]
Avoid double-free in vacuumlo error path.
The code would do "PQclear(res)" twice if lo_unlink failed, evidently
due to careless thinking about how far out a "break" would break.
Remove the extra PQclear and adjust the loop logic so that we'll fall
out of both levels of loop after an error, as was clearly the intent.
Spotted by Coverity. I have no idea why it took this long to notice,
since the bug has been there since commit 67ccbb080. Accordingly,
back-patch to all supported branches.
Fix WAL format incompatibility introduced by backpatching of 52ac6cd2d0
52ac6cd2d0 added a new field to ginxlogDeletePage and was backpatched to 9.4.
That led to problems when a patched postgres instance applies WAL records
generated by a non-patched one: WAL records generated by a non-patched
instance don't contain the new field, which the patched one expects to see.
Thankfully, we can distinguish patched and non-patched WAL records by their
data size. If we see that a WAL record was generated by a non-patched
instance, we skip processing of the new field. This commit comes with some
assertions. In particular, if it appears that on some platform the struct
data size didn't change, then a static assertion will trigger.
Reported-by: Simon Riggs
Discussion: https://postgr.es/m/CANP8%2Bj%2BK4whxf7ET7%2BgO%2BG-baC3-WxqqH%3DnV4X2CgfEPA3Yu3g%40mail.gmail.com
Author: Alexander Korotkov
Reviewed-by: Simon Riggs, Alvaro Herrera
Backpatch-through: 9.4
Michael Paquier [Sun, 24 Mar 2019 12:01:10 +0000 (21:01 +0900)]
Make current_logfiles use permissions assigned to files in data directory
Since its introduction in 19dc233c, current_logfiles has been assigned
the same permissions as a log file, which can be enforced with
log_file_mode. This setup can lead to incompatibility problems with
group access permissions as current_logfiles is not located in the log
directory, but at the root of the data folder. Hence, if group
permissions are used but log_file_mode is more restrictive, a backup
with a user in the group having read access could fail even if the log
directory is located outside of the data folder.
Per discussion with the folks mentioned below, we have concluded that
current_logfiles should not be treated as a log file as it only stores
metadata related to log files, and that it should use the same
permissions as all other files in the data directory. This solution has
the merit of being simple and fixes all the interaction problems between
group access and log_file_mode.
Author: Haribabu Kommi
Reviewed-by: Stephen Frost, Robert Haas, Tom Lane, Michael Paquier
Discussion: https://postgr.es/m/CAJrrPGcEotF1P7AWoeQyD3Pqr-0xkQg_Herv98DjbaMj+naozw@mail.gmail.com
Backpatch-through: 11, where group access has been added.
Tom Lane [Sat, 23 Mar 2019 20:24:30 +0000 (16:24 -0400)]
Accept XML documents when xmloption = content, as required by SQL:2006+.
Previously we were using the SQL:2003 definition, which doesn't allow
this, but that creates a serious dump/restore gotcha: there is no
setting of xmloption that will allow all valid XML data. Hence,
switch to the 2006 definition.
Since libxml doesn't accept <!DOCTYPE> directives in the mode we
use for CONTENT parsing, the implementation is to detect <!DOCTYPE>
in the input and switch to DOCUMENT parsing mode. This should not
cost much, because <!DOCTYPE> should be close to the front of the
input if it's there at all. It's possible that this causes the
error messages for malformed input to be slightly different than
they were before, if said input includes <!DOCTYPE>; but that does
not seem like a big problem.
In passing, buy back a few cycles in parsing of large XML documents
by not doing strlen() of the whole input in parse_xml_decl().
Back-patch because dump/restore failures are not nice. This change
shouldn't break any cases that worked before, so it seems safe to
back-patch.
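For illustration:

    SET xmloption = content;
    SELECT '<!DOCTYPE doc><doc/>'::xml; -- rejected under the SQL:2003
                                        -- rules, accepted now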
Alvaro Herrera [Wed, 20 Mar 2019 20:23:26 +0000 (17:23 -0300)]
Restore RI trigger sanity check
I unnecessarily removed this check in 3de241dba86f because I
misunderstood what the final representation of constraints across a
partitioning hierarchy was to be. Put it back (in both branches).
Tom Lane [Tue, 19 Mar 2019 20:58:20 +0000 (16:58 -0400)]
Hack back-branch SSL tests to avoid intermittent buildfarm failures.
Buildfarm member eelpout sometimes reports the wrong error message for
an SSL connection failure. In HEAD, this problem is believed to be
solved by commit 1f39a1c06, but I'm as yet unwilling to back-patch that.
The problem seems fairly unlikely to be an issue in the field, since (as
far as we can tell) it happens only during a failure of a local-loopback
SSL connection, and it's improbable even then. It seems better to just
live with it for the time being; but let's tweak the regression test to
accept the other error message as a "pass".
Needed in v11 only, since older branches didn't check the message
text anyway.
Tom Lane [Tue, 19 Mar 2019 16:49:27 +0000 (12:49 -0400)]
Make checkpoint requests more robust.
Commit 6f6a6d8b1 introduced a delay of up to 2 seconds if we're trying
to request a checkpoint but the checkpointer hasn't started yet (or,
much less likely, our kill() call fails). However buildfarm experience
shows that that's not quite enough for slow or heavily-loaded machines.
There's no good reason to assume that the checkpointer won't start
eventually, so we may as well make the timeout much longer, say 60 sec.
However, if the caller didn't say CHECKPOINT_WAIT, it seems like a bad
idea to be waiting at all, much less for as long as 60 sec. We can
remove the need for that, and make this whole thing more robust, by
adjusting the code so that the existence of a pending checkpoint
request is clear from the contents of shared memory, and making sure
that the checkpointer process will notice it at startup even if it did
not get a signal. In this way there's no need for a non-CHECKPOINT_WAIT
call to wait at all; if it can't send the signal, it can nonetheless
assume that the checkpointer will eventually service the request.
A potential downside of this change is that "kill -INT" on the checkpointer
process is no longer enough to trigger a checkpoint, should anyone be
relying on something so hacky. But there's no obvious reason to do it
like that rather than issuing a plain old CHECKPOINT command, so we'll
assume that nobody is. There doesn't seem to be a way to preserve this
undocumented quasi-feature without introducing race conditions.
Since a principal reason for messing with this is to prevent intermittent
buildfarm failures, back-patch to all supported branches.
Tom Lane [Mon, 18 Mar 2019 21:54:24 +0000 (17:54 -0400)]
Fix memory leak in printtup.c.
Commit f2dec34e1 changed things so that printtup's output stringinfo
buffer was allocated outside the per-row temporary context, not inside
it. This creates a need to free that buffer explicitly when the temp
context is freed, but that was overlooked. In most cases, this is all
happening inside a portal or executor context that will go away shortly
anyhow, but that's not always true. Notably, the stringinfo ends up
getting leaked when JDBC uses row-at-a-time fetches. For a query
that returns wide rows, that adds up after awhile.
Per bug #15700 from Matthias Otterbach. Back-patch to v11 where the
faulty code was added.
We should try to prewarm each database only once. Otherwise, if
prewarming fails for some reason, it will just keep retrying in an
infinite loop. This can happen if, for example, the database has been
dropped. The existing code was intended to implement the try-once
behavior, but failed to do so because it neglected to set
worker.bgw_restart_time to BGW_NEVER_RESTART.
Michael Paquier [Mon, 18 Mar 2019 01:35:01 +0000 (10:35 +0900)]
Fix pg_rewind when rewinding new database with tables included
This fixes an issue introduced by 266b6ac, which has added filters to
exclude file patterns on the target and source data directories to
reduce the number of files transferred. Filters get applied to both
the target and source data files, and include pg_internal.init which is
present for each database once relations are created on it. However, if
the target differed from the source with at least one new database with
relations, the rewind would fail due to the exclusion filters applied on
the target files, causing pg_internal.init to still be present on the
target database folder, while its contents should have been completely
removed so that there is nothing remaining inside at the time of the
folder deletion.
Applying exclusion filters on the source files is fine, because this way
the amount of data copied from the source to the target is reduced. And
actually, not applying the filters on the target is what pg_rewind
should do, because this causes such files to be automatically removed
during the rewind on the target. Exclusion filters apply to paths which
are removed or recreated automatically at startup, so removing all those
files on the target during the rewind is a win.
The existing set of TAP tests already stresses the rewind of databases,
but it did not include any tables on those newly-created databases.
Creating extra tables in this case is enough to reproduce the failure,
so the existing tests are extended to close the gap.
Reported-by: Mithun Cy
Author: Michael Paquier
Discussion: https://postgr.es/m/CADq3xVYt6_pO7ZzmjOqPgY9HWsL=kLd-_tNyMtdfjKqEALDyTA@mail.gmail.com
Backpatch-through: 11
Michael Paquier [Mon, 18 Mar 2019 00:12:24 +0000 (09:12 +0900)]
Error out in pg_verify_checksums on incompatible block size
pg_verify_checksums is compiled with a given block size and has a hard
dependency on it per the way checksums are calculated via
checksum_impl.h, and trying to use the tool on a data folder which does
not have the same block size would result in incorrect checksum
calculations and/or block read errors, making the tool report that the
data folder is corrupted. This is harmless as checksums are only checked
for now, but it is very confusing for the user, so properly issue an
error if the block size used at compilation and the block size used in
the data folder do not match.
Reported-by: Sergei Kornilov
Author: Michael Banck, Michael Paquier
Reviewed-by: Fabien Coelho, Magnus Hagander
Discussion: https://postgr.es/m/20190317054657.GA3357@paquier.xyz
Backpatch-through: 11
Peter Eisentraut [Thu, 14 Mar 2019 07:25:25 +0000 (08:25 +0100)]
Fix volatile vs. pointer confusion
Variables used after a longjmp() need to be declared volatile. In
case of a pointer, it's the pointer itself that needs to be declared
volatile, not the pointed-to value. So we need the volatile qualifier on
the pointer itself, as in "type * volatile var", not on the pointed-to
value, as in "volatile type * var".
Tom Lane [Thu, 14 Mar 2019 16:16:09 +0000 (12:16 -0400)]
Ensure dummy paths have correct required_outer if rel is parameterized.
The assertions added by commits 34ea1ab7f et al found another problem:
set_dummy_rel_pathlist and mark_dummy_rel were failing to label
the dummy paths they create with the correct outer_relids, in case
the relation is necessarily parameterized due to having lateral
references in its tlist. It's likely that this has no user-visible
consequences in production builds, at the moment; but still an assertion
failure is a bad thing, so back-patch the fix.
Per bug #15694 from Roman Zharkov (via Alexander Lakhin)
and an independent report by Tushar Ahuja.
Michael Paquier [Thu, 14 Mar 2019 05:15:13 +0000 (14:15 +0900)]
Fix thinko when bumping on temporary directories in pg_verify_checksums
This fixes an oversight from 5c99513. This has no actual consequence as
PG_TEMP_FILE_PREFIX and PG_TEMP_FILES_DIR have the same value so when
bumping on a temporary path the directory scan was still moving on to
the next entry instead of skipping the rest of the scan, but let's keep
the logic correct.
Author: Michael Banck
Reviewed-by: Kyotaro Horiguchi
Discussion: https://postgr.es/m/20190314.115417.58230569.horiguchi.kyotaro@lab.ntt.co.jp
Backpatch-through: 11
Michael Paquier [Wed, 13 Mar 2019 00:51:25 +0000 (09:51 +0900)]
Fix cross-version compatibility checks of pg_verify_checksums
pg_verify_checksums performs a read of the control file, and the data it
fetches should be from a data folder compatible with the major version
of Postgres the binary has been compiled with, but we never actually
checked that compatibility.
Reported-by: Sergei Kornilov
Author: Michael Paquier
Reviewed-by: Sergei Kornilov
Discussion: https://postgr.es/m/155231347133.16480.11453587097036807558.pgcf@coridan.postgresql.org
Backpatch-through: 11
Etsuro Fujita [Tue, 12 Mar 2019 07:32:27 +0000 (16:32 +0900)]
Fix testing of parallel-safety of scan/join target.
In commit 960df2a971 ("Correctly assess parallel-safety of tlists when
SRFs are used."), the testing of scan/join target was done incorrectly,
which caused a plan-quality problem. Backpatch through to v11 where
the aforementioned commit went in, since this is a regression from v10.
Author: Etsuro Fujita
Reviewed-by: Robert Haas and Tom Lane
Discussion: https://postgr.es/m/5C75303E.8020303@lab.ntt.co.jp
Alvaro Herrera [Sun, 10 Mar 2019 22:45:29 +0000 (19:45 -0300)]
Fix documentation on partitioning vs. foreign tables
1. The PARTITION OF clause of CREATE FOREIGN TABLE was not explained in
the CREATE FOREIGN TABLE reference page. Add it.
(Postgres 10 onwards)
2. The limitation that tuple routing cannot target partitions that are
foreign tables was not documented clearly enough. Improve wording.
(Postgres 10 onwards)
3. The UPDATE tuple re-routing concurrency behavior was explained in
the DDL chapter, which doesn't seem the right place. Move it to the
UPDATE reference page instead. (Postgres 11 onwards).
Authors: Amit Langote, David Rowley
Reviewed-by: Etsuro Fujita
Reported-by: Derek Hans
Discussion: https://postgr.es/m/CAGrP7a3Xc1Qy_B2WJcgAD8uQTS_NDcJn06O5mtS_Ne1nYhBsyw@mail.gmail.com
Tom Lane [Sun, 10 Mar 2019 16:58:52 +0000 (12:58 -0400)]
Disallow NaN as a value for floating-point GUCs.
None of the code that uses GUC values is really prepared for them to
hold NaN, but parse_real() didn't have any defense against accepting
such a value. Treat it the same as a syntax error.
I haven't attempted to analyze the exact consequences of setting any
of the float GUCs to NaN, but since they're quite unlikely to be good,
this seems like a back-patchable bug fix.
Note: we don't need an explicit test for +-Infinity because those will
be rejected by existing range checks. I added a regression test for
that in HEAD, but not older branches because the spelling of the value
in the error message will be platform-dependent in branches where we
don't always use port/snprintf.c.
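For illustration:

    SET seq_page_cost = 'NaN';  -- now rejected like any other invalid value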
Michael Paquier [Fri, 8 Mar 2019 06:10:31 +0000 (15:10 +0900)]
Fix function signatures of pageinspect in documentation
tuple_data_split() lacked the type of its first argument, and
heap_page_item_attrs() had its first and second arguments reversed,
with the bytea argument using an incorrect name.
Tom Lane [Thu, 7 Mar 2019 19:21:52 +0000 (14:21 -0500)]
Fix handling of targetlist SRFs when scan/join relation is known empty.
When we introduced separate ProjectSetPath nodes for application of
set-returning functions in v10, we inadvertently broke some cases where
we're supposed to recognize that the result of a subquery is known to be
empty (contain zero rows). That's because IS_DUMMY_REL was just looking
for a childless AppendPath without allowing for a ProjectSetPath being
possibly stuck on top. In itself, this didn't do anything much worse
than produce slightly worse plans for some corner cases.
Then in v11, commit 11cf92f6e rearranged things to allow the scan/join
targetlist to be applied directly to partial paths before they get
gathered. But it inserted a short-circuit path for dummy relations
that was a little too short: it failed to insert a ProjectSetPath node
at all for a targetlist containing set-returning functions, resulting in
bogus "set-valued function called in context that cannot accept a set"
errors, as reported in bug #15669 from Madelaine Thibaut.
The best way to fix this mess seems to be to reimplement IS_DUMMY_REL
so that it drills down through any ProjectSetPath nodes that might be
there (and it seems like we'd better allow for ProjectionPath as well).
While we're at it, make it look at rel->pathlist not cheapest_total_path,
so that it gives the right answer independently of whether set_cheapest
has been done lately. That dependency looks pretty shaky in the context
of code like apply_scanjoin_target_to_paths, and even if it's not broken
today it'd certainly bite us at some point. (Nastily, unsafe use of the
old coding would almost always work; the hazard comes down to possibly
looking through a dangling pointer, and only once in a blue moon would
you find something there that resulted in the wrong answer.)
It now looks like it was a mistake for IS_DUMMY_REL to be a macro: if
there are any extensions using it, they'll continue to use the old
inadequate logic until they're recompiled, after which they'll fail
to load into server versions predating this fix. Hopefully there are
few such extensions.
Having fixed IS_DUMMY_REL, the special path for dummy rels in
apply_scanjoin_target_to_paths is unnecessary as well as being wrong,
so we can just drop it.
Also change a few places that were testing for partitioned-ness of a
planner relation but not using IS_PARTITIONED_REL for the purpose; that
seems unsafe as well as inconsistent, plus it required an ugly hack in
apply_scanjoin_target_to_paths.
In passing, save a few cycles in apply_scanjoin_target_to_paths by
skipping processing of pre-existing paths for partitioned rels,
and do some cosmetic cleanup and comment adjustment in that function.
I renamed IS_DUMMY_PATH to IS_DUMMY_APPEND with the intention of breaking
any code that might be using it, since in almost every case that would
be wrong; IS_DUMMY_REL is what to be using instead.
In HEAD, also make set_dummy_rel_pathlist static (since it's no longer
used from outside allpaths.c), and delete is_dummy_plan, since it's no
longer used anywhere.
Andrew Dunstan [Mon, 4 Mar 2019 22:11:18 +0000 (17:11 -0500)]
Disable dump_connstr test on Msys2
For some reason the dump test with names with high bits set fails on
Msys2 (although not Msys1). Disable the tests for now, so that other
tests can run.
Andrew Dunstan [Tue, 5 Mar 2019 15:46:21 +0000 (10:46 -0500)]
Fix pgbench TAP test failure with funky file names (redux)
This test fails if the containing directory contains a funny character
such as a space or some perl metacharacter. To avoid that, we check for
file names using readdir and a regex, rather than using a glob pattern.
Michael Paquier [Mon, 4 Mar 2019 00:50:02 +0000 (09:50 +0900)]
Fix error handling of readdir() port implementation on first file lookup
The implementation of readdir() in src/port/ which gets used by MSVC has
been added in 399a36a, and since the beginning it considers all errors
on the first file lookup as ENOENT, setting errno accordingly and
letting the routine's caller think that the directory is empty. While
this is normally enough for the case of the backend, this can confuse
callers of this routine on Windows as all errors would map to the same
behavior. So, for example, even permission errors would be thought as
having an empty directory, while there could be contents in it.
This commit changes the error handling so that readdir() behaves
similarly to native implementations: force errno=0 when seeing
ERROR_FILE_NOT_FOUND as the error, and consider other errors as plain
failures.
While looking at the patch, I noticed that MinGW does not enforce
errno=0 when looking at the first file, but it gets enforced on the next
file lookups. A comment related to that was incorrect in the code.
Reported-by: Yuri Kurenkov
Diagnosed-by: Yuri Kurenkov, Grigory Smolkin
Author: Konstantin Knizhnik
Reviewed-by: Andrew Dunstan, Michael Paquier
Discussion: https://postgr.es/m/2cad7829-8d66-e39c-b937-ac825db5203d@postgrespro.ru
Backpatch-through: 9.4
Dean Rasheed [Sun, 3 Mar 2019 10:52:54 +0000 (10:52 +0000)]
Further fixing for multi-row VALUES lists for updatable views.
Previously, rewriteTargetListIU() generated a list of attribute
numbers from the targetlist, which were passed to rewriteValuesRTE(),
which expected them to contain the same number of entries as there are
columns in the VALUES RTE, and to be in the same order. That was fine
when the target relation was a table, but for an updatable view it
could be broken in at least three different ways ---
rewriteTargetListIU() could insert additional targetlist entries for
view columns with defaults, the view columns could be in a different
order from the columns of the underlying base relation, and targetlist
entries could be merged together when assigning to elements of an
array or composite type. As a result, when recursing to the base
relation, the list of attribute numbers generated from the rewritten
targetlist could no longer be relied upon to match the columns of the
VALUES RTE. We got away with that prior to 41531e42d3 because it used
to always be the case that rewriteValuesRTE() did nothing for the
underlying base relation, since all DEFAULTS had already been replaced
when it was initially invoked for the view, but that was incorrect
because it failed to apply defaults from the base relation.
Fix this by examining the targetlist entries more carefully and
picking out just those that are simple Vars referencing the VALUES
RTE. That's sufficient for the purposes of rewriteValuesRTE(), which
is only responsible for dealing with DEFAULT items in the VALUES
RTE. Any DEFAULT item in the VALUES RTE that doesn't have a matching
simple-Var-assignment in the targetlist is an error which we complain
about, but in theory that ought to be impossible.
Additionally, move this code into rewriteValuesRTE() to give a clearer
separation of concerns between the 2 functions. There is no need for
rewriteTargetListIU() to know about the details of the VALUES RTE.
While at it, fix the comment for rewriteValuesRTE() which claimed that
it doesn't support array element and field assignments --- that hasn't
been true since a3c7a993d5 (9.6 and later).
Back-patch to all supported versions, with minor differences for the
pre-9.6 branches, which don't support array element and field
assignments to the same column in multi-row VALUES lists.
Michael Paquier [Thu, 28 Feb 2019 00:40:39 +0000 (09:40 +0900)]
Fix SCRAM authentication via SSL when mixing versions of OpenSSL
When using a libpq client linked with OpenSSL 1.0.1 or older to connect
to a backend linked with OpenSSL 1.0.2 or newer, the server would send
SCRAM-SHA-256-PLUS and SCRAM-SHA-256 as valid mechanisms for the SASL
exchange, and the client would choose SCRAM-SHA-256-PLUS even if it does
not support channel binding, leading to a confusing error. In this
case, what the client ought to do is switch to SCRAM-SHA-256 so that the
authentication can move on and succeed.
So for a SCRAM authentication over SSL, here are all the cases present
and how we deal with them using libpq:
1) Server supports channel binding, it sends SCRAM-SHA-256-PLUS and
SCRAM-SHA-256 as allowed mechanisms.
1-1) Client supports channel binding, chooses SCRAM-SHA-256-PLUS.
1-2) Client does not support channel binding, chooses SCRAM-SHA-256.
2) Server does not support channel binding, sends SCRAM-SHA-256 as
allowed mechanism.
2-1) Client supports channel binding, still it has no choice but to
choose SCRAM-SHA-256.
2-2) Client does not support channel binding, it chooses SCRAM-SHA-256.
In all these scenarios the connection should succeed, and the one which
was handled incorrectly prior to this commit is 1-2), causing the
connection attempt to fail because the client chose SCRAM-SHA-256-PLUS
over SCRAM-SHA-256.
Reported-by: Hugh Ranalli
Diagnosed-by: Peter Eisentraut
Author: Michael Paquier
Reviewed-by: Peter Eisentraut
Discussion: https://postgr.es/m/CAAhbUMO89SqUk-5mMY+OapgWf-twF2NA5sCucbHEzMfGbvcepA@mail.gmail.com
Backpatch-through: 11
Thomas Munro [Sun, 24 Feb 2019 21:54:12 +0000 (10:54 +1300)]
Fix inconsistent out-of-memory error reporting in dsa.c.
Commit 16be2fd1 introduced the flag DSA_ALLOC_NO_OOM to control whether
the DSA allocator would raise an error or return InvalidDsaPointer on
failure to allocate. One edge case was not handled correctly: if we
fail to allocate an internal "span" object for a large allocation, we
would always return InvalidDsaPointer regardless of the flag; a caller
not expecting that could then dereference a null pointer.
This is a plausible explanation for a one-off report of a segfault.
Remove a redundant pair of braces so that all three stanzas that handle
DSA_ALLOC_NO_OOM match in style, for visual consistency.
While fixing inconsistencies, if FreePageManagerGet() can't supply the
pages that our book-keeping says it should be able to supply, then we
should always report a FATAL error. Previously we treated that as a
regular allocation failure in one code path, but as a FATAL condition
in another.
Back-patch to 10, where dsa.c landed.
Author: Thomas Munro
Reported-by: Jakub Glapa
Discussion: https://postgr.es/m/CAEepm=2oPqXxyWQ-1o60tpOLrwkw=VpgNXqqF1VN2EyO9zKGQw@mail.gmail.com
Tom Lane [Sun, 24 Feb 2019 17:51:50 +0000 (12:51 -0500)]
Fix ecpg bugs caused by missing semicolons in the backend grammar.
The Bison documentation clearly states that a semicolon is required
after every grammar rule, and our scripts that generate ecpg's
grammar from the backend's implicitly assumed this is true. But it
turns out that only ancient versions of Bison actually enforce that.
There have been a couple of rules without trailing semicolons in
gram.y for some time, and as a consequence, ecpg's grammar was faulty
and produced wrong output for the affected statements.
To fix, add the missing semis, and add some cross-checks to ecpg's
scripts so that they'll bleat if we mess this up again.
The cases that were broken were:
* "SET variable = DEFAULT" (but not "SET variable TO DEFAULT"),
as well as allied syntaxes such as ALTER SYSTEM SET ... DEFAULT.
These produced syntactically invalid output that the server
would reject.
* Multiple type names in DROP TYPE/DOMAIN commands. Only the
first type name would be listed in the emitted command.
Per report from Daisuke Higuchi. Back-patch to all supported versions.
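For illustration, in an ecpg program (hypothetical type names):

    EXEC SQL SET search_path = DEFAULT;   -- emitted invalid output before
    EXEC SQL DROP TYPE mood, shirt_size;  -- only "mood" was emitted before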
Thomas Munro [Sun, 24 Feb 2019 10:48:52 +0000 (23:48 +1300)]
Tolerate EINVAL when calling fsync() on a directory.
Previously, we tolerated EBADF as a way for the operating system to
indicate that it doesn't support fsync() on a directory. Tolerate
EINVAL too, for older versions of Linux CIFS.
Bug #15636. Back-patch all the way.
Reported-by: John Klann
Discussion: https://postgr.es/m/15636-d380890dafd78fc6@postgresql.org
Thomas Munro [Sun, 24 Feb 2019 00:38:15 +0000 (13:38 +1300)]
Tolerate ENOSYS failure from sync_file_range().
One unintended consequence of commit 9ccdd7f6 was that Windows WSL
users started getting a panic whenever we tried to initiate data
flushing with sync_file_range(), because WSL does not implement that
system call. Previously, they got a stream of periodic warnings,
which was also undesirable but at least ignorable.
Prevent the panic by handling ENOSYS specially and skipping the panic
promotion with data_sync_elevel(). Also suppress future attempts
after the first such failure so that the pre-existing problem of
noisy warnings is improved.
Back-patch to 9.6 (older branches were not affected in this way by 9ccdd7f6).
Author: Thomas Munro and James Sewell
Tested-by: James Sewell
Reported-by: Bruce Klein
Discussion: https://postgr.es/m/CA+mCpegfOUph2U4ZADtQT16dfbkjjYNJL1bSTWErsazaFjQW9A@mail.gmail.com
Tom Lane [Fri, 22 Feb 2019 17:23:00 +0000 (12:23 -0500)]
Fix plan created for inherited UPDATE/DELETE with all tables excluded.
In the case where inheritance_planner() finds that every table has
been excluded by constraints, it thought it could get away with
making a plan consisting of just a dummy Result node. While certainly
there's no updating or deleting to be done, this had two user-visible
problems: the plan did not report the correct set of output columns
when a RETURNING clause was present, and if there were any
statement-level triggers that should be fired, it didn't fire them.
Hence, rather than only generating the dummy Result, we need to
stick a valid ModifyTable node on top, which requires a tad more
effort here.
It's been broken this way for as long as inheritance_planner() has
known about deleting excluded subplans at all (cf commit 635d42e9c),
so back-patch to all supported branches.
Amit Langote and Tom Lane, per a report from Petr Fedorov.
Peter Eisentraut [Thu, 21 Feb 2019 14:39:37 +0000 (15:39 +0100)]
Fix dbtoepub output file name
In previous releases, the input file of dbtoepub was postgres.xml, and
dbtoepub knows to derive the output file name postgres.epub from that
automatically. But now the input file is postgres.sgml (since
postgres.sgml is itself an XML file and we no longer need the
intermediate postgres.xml file), but dbtoepub doesn't know how to deal
with the .sgml suffix, so the automatically derived output file name
becomes postgres.sgml.epub. Fix by adding an explicit -o option.
Tom Lane [Thu, 21 Feb 2019 01:53:08 +0000 (20:53 -0500)]
Speed up match_eclasses_to_foreign_key_col() when there are many ECs.
Check ec_relids before bothering to iterate through the EC members.
On a perhaps extreme, but still real-world, query in which
match_eclasses_to_foreign_key_col() accounts for the bulk of the
planner's runtime, this saves nearly 40% of the runtime. It's a bit
of a stopgap fix, but it's simple enough to be back-patched to 9.6
where this code came in; so let's do that.
Tom Lane [Wed, 20 Feb 2019 18:36:55 +0000 (13:36 -0500)]
Fix incorrect strictness test for ArrayCoerceExpr expressions.
The recursion in contain_nonstrict_functions_walker() was done wrong,
causing the strictness check to be bypassed for a parse node that
is the immediate input of an ArrayCoerceExpr node. This could allow,
for example, incorrect decisions about whether a strict SQL function
can be inlined.
I didn't add a regression test, because (a) the bug is so narrow
and (b) I couldn't think of a test case that wasn't dependent on a
large number of other behaviors, to the point where it would likely
soon rot to the point of not testing what it was intended to.
I broke this in commit c12d570fa, so back-patch to v11.
Dean Rasheed [Wed, 20 Feb 2019 08:28:42 +0000 (08:28 +0000)]
Fix DEFAULT-handling in multi-row VALUES lists for updatable views.
INSERT ... VALUES for a single VALUES row is implemented differently
from a multi-row VALUES list, which causes inconsistent behaviour in
the way that DEFAULT items are handled. In particular, when inserting
into an auto-updatable view on top of a table with a column default, a
DEFAULT item in a single VALUES row gets correctly replaced with the
table column's default, but for a multi-row VALUES list it is replaced
with NULL.
Fix this by allowing rewriteValuesRTE() to leave DEFAULT items in the
VALUES list untouched if the target relation is an auto-updatable view
and has no column default, deferring DEFAULT-expansion until the query
against the base relation is rewritten. For all other types of target
relation, including tables and trigger- and rule-updatable views, we
must continue to replace DEFAULT items with NULL in the absence of a
column default.
This is somewhat complicated by the fact that if an auto-updatable
view has DO ALSO rules attached, the VALUES lists for the product
queries need to be handled differently from the original query, since
the product queries need to act like rule-updatable views whereas the
original query has auto-updatable view semantics.
Back-patch to all supported versions.
Reported by Roger Curley (bug #15623). Patch by Amit Langote and me.
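For illustration:

    CREATE TABLE foo (a int DEFAULT 42);
    CREATE VIEW foo_v AS SELECT * FROM foo;        -- auto-updatable
    INSERT INTO foo_v VALUES (DEFAULT);            -- inserts 42
    INSERT INTO foo_v VALUES (DEFAULT), (DEFAULT); -- inserted NULLs before
                                                   -- the fix; now inserts 42s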
Michael Paquier [Wed, 20 Feb 2019 03:31:27 +0000 (12:31 +0900)]
Mark correctly initial slot snapshots with MVCC type when built
When building an initial slot snapshot, the snapshot was left marked as
a historic MVCC snapshot: the type marker is set in
SnapBuildBuildSnapshot() but was not overridden in
SnapBuildInitialSnapshot().
Existing callers of SnapBuildBuildSnapshot() do not care about the type
of snapshot used, but extensions calling it actually may, as reported.
Michael Paquier [Mon, 18 Feb 2019 05:23:44 +0000 (14:23 +0900)]
Fix some issues with TAP tests of pg_basebackup
ee9e145 already fixed the pg_basebackup checksum tests once, but one
seek() call was still missed. Also, the data written into files to
emulate corruption was not actually writing zeros, as the quoting style
was incorrect.
Author: Michael Banck
Discussion: https://postgr.es/m/1550153276.796.35.camel@credativ.de
Backpatch-through: 11
Thomas Munro [Sun, 17 Feb 2019 20:53:26 +0000 (09:53 +1300)]
Fix race in dsm_unpin_segment() when handles are reused.
Teach dsm_unpin_segment() to skip segments that are in the process
of being destroyed by another backend, when searching for a handle.
Such a segment cannot possibly be the one we are looking for, even
if its handle matches. Another slot might hold a recently created
segment that has the same handle value by coincidence, and we need
to keep searching for that one.
The bug caused rare "cannot unpin a segment that is not pinned"
errors on 10 and 11. Similar to commit 6c0fb941 for dsm_attach().
Back-patch to 10, where dsm_unpin_segment() landed.
Author: Thomas Munro
Reported-by: Justin Pryzby
Tested-by: Justin Pryzby (along with other recent DSA/DSM fixes)
Discussion: https://postgr.es/m/20190216023854.GF30291@telsasoft.com
Joe Conway [Sun, 17 Feb 2019 18:15:06 +0000 (13:15 -0500)]
Fix documentation for dblink_error_message() return value
The dblink documentation claims that an empty string is returned if there
has been no error, however OK is actually returned in that case. Also,
clarify that an async error may not be seen unless dblink_is_busy() or
dblink_get_result() have been called first.
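For illustration (hypothetical connection name):

    SELECT dblink_error_message('myconn'); -- returns 'OK', not '', when
                                           -- no error has occurred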
Tom Lane [Sun, 17 Feb 2019 17:37:31 +0000 (12:37 -0500)]
Fix CREATE VIEW to allow zero-column views.
We should logically have allowed this case when we allowed zero-column
tables, but it was overlooked.
Although this might be thought a feature addition, it's really a bug
fix, because it was possible to create a zero-column view via
the convert-table-to-view code path, and then you'd have a situation
where dump/reload would fail. Hence, back-patch to all supported
branches.
Arrange the added test cases to provide coverage of the related
pg_dump code paths (since these views will be dumped and reloaded
during the pg_upgrade regression test). I also made them test
the case where pg_dump has to postpone the view rule into post-data,
which disturbingly had no regression coverage before.
Report and patch by Ashutosh Sharma (test case by me)
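For illustration:

    CREATE TABLE zt ();                -- zero-column tables are allowed
    CREATE VIEW zv AS SELECT FROM zt;  -- now allowed as well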
Tatsuo Ishii [Sun, 17 Feb 2019 11:35:09 +0000 (20:35 +0900)]
Doc: remove ancient comment.
There's a very old comment in rules.sgml, added back in 2003. It
expected a feature to come back, but that never happened. So now we can
safely remove the comment. Back-patched to all supported branches.
Michael Paquier [Fri, 15 Feb 2019 08:12:31 +0000 (17:12 +0900)]
Fix support for CREATE TABLE IF NOT EXISTS AS EXECUTE
The grammar IF NOT EXISTS for CTAS has been supported since 9.5 and is
documented as such; however, the case of using EXECUTE as the query has
never been covered, as EXECUTE CTAS statements and normal CTAS statements
are parsed separately.
Author: Andreas Karlsson
Discussion: https://postgr.es/m/2ddcc188-e37c-a0be-32bf-a56b07c3559e@proxel.se
Backpatch-through: 9.5
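For illustration:

    PREPARE q AS SELECT 1 AS x;
    CREATE TABLE IF NOT EXISTS t AS EXECUTE q; -- previously a syntax error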
Thomas Munro [Thu, 14 Feb 2019 21:19:11 +0000 (10:19 +1300)]
Fix race in dsm_attach() when handles are reused.
DSM handle values can be reused as soon as the underlying shared memory
object has been destroyed. That means that for a brief moment we
might have two DSM slots with the same handle. While trying to attach,
if we encounter a slot with refcnt == 1, meaning that it is currently
being destroyed, we should continue our search in case the same handle
exists in another slot.
The race manifested as a rare "dsa_area could not attach to segment"
error, and was more likely in 10 and 11 due to the lack of distinct
seed for random() in parallel workers. It was made very unlikely in
master by commit 197e4af9, and older releases don't usually create
new DSM segments in background workers so it was also unlikely there.
This fixes the root cause of bug report #15585, in which the error
could also sometimes result in a self-deadlock in the error path.
It's not yet clear if further changes are needed to avoid that failure
mode.
Back-patch to 9.4, where dsm.c arrived.
Author: Thomas Munro
Reported-by: Justin Pryzby, Sergei Kornilov
Discussion: https://postgr.es/m/20190207014719.GJ29720@telsasoft.com
Discussion: https://postgr.es/m/15585-324ff6a93a18da46@postgresql.org