Tom Lane [Fri, 23 Jun 2017 20:00:45 +0000 (16:00 -0400)]
Fix incorrect buffer-length argument to uloc_getDisplayName().
The maxResultSize argument of uloc_getDisplayName is the number of
UChars in the output buffer, not the number of bytes. In principle
this could result in a stack smash, although at least in my Fedora 25
install there are no ICU locales with display names long enough to
overrun the buffer. But it's easily proven to be wrong by reducing
the length of displayname to around 20, whereupon a stack smash
does happen.
(This is a rather scary bug, because the same mistake could easily
have been made in other places; but in a quick code search looking
at uses of UChar I could not find any other instances.)
Peter Eisentraut [Fri, 23 Jun 2017 19:12:36 +0000 (15:12 -0400)]
Fix replication with replica identity full
The comparison with the target rows on the subscriber side was done with
datumIsEqual(), which can have false negatives. For instance, it didn't
work reliably for text columns. So use the equality operator provided
by the type cache instead.
Also add more user documentation about replica identity requirements.
Tom Lane [Fri, 23 Jun 2017 18:19:48 +0000 (14:19 -0400)]
Rethink behavior of pg_import_system_collations().
Marco Atzeri reported that initdb would fail if "locale -a" reported
the same locale name more than once. All previous versions of Postgres
implicitly de-duplicated the results of "locale -a", but the rewrite
to move the collation import logic into C had lost that property.
It had also lost the property that locale names matching built-in
collation names were silently ignored.
The simplest way to fix this is to make initdb run the function in
if-not-exists mode, which means that there's no real use-case for
non if-not-exists mode; we might as well just drop the boolean argument
and simplify the function's definition to be "add any collations not
already known". This change also gets rid of some odd corner cases
caused by the fact that aliases were added in if-not-exists mode even
if the function argument said otherwise.
While at it, adjust the behavior so that pg_import_system_collations()
doesn't spew "collation foo already exists, skipping" messages during a
re-run; that's completely unhelpful, especially since there are often
hundreds of them. And make it return a count of the number of collations
it did add, which seems like it might be helpful.
Also, re-integrate the previous coding's property that it would make a
deterministic selection of which alias to use if there were conflicting
possibilities. This would only come into play if "locale -a" reports
multiple equivalent locale names, say "de_DE.utf8" and "de_DE.UTF-8",
but that hardly seems out of the question.
In passing, fix incorrect behavior in pg_import_system_collations()'s
ICU code path: it neglected CommandCounterIncrement, which would result
in failures if ICU returns duplicate names, and it would try to create
comments even if a new collation hadn't been created.
Also, reorder operations in initdb so that the 'ucs_basic' collation
is created before calling pg_import_system_collations() not after.
This prevents a failure if "locale -a" were to report a locale named
that. There's no reason to think that that ever happens in the wild,
but the old coding would have survived it, so let's be equally robust.
Simon Riggs [Fri, 23 Jun 2017 17:58:46 +0000 (18:58 +0100)]
Improve replication lag interpolation after idle period
After sitting idle and fully replayed for a while and then encountering
a new burst of WAL activity, we interpolate between an ancient sample and the
not-yet-reached one for the new traffic. That produced a corner case report
of lag after receiving first new reply from standby, which might sometimes
be a large spike.
Correct this by resetting last_read time and handle that new case.
Tom Lane [Fri, 23 Jun 2017 16:22:06 +0000 (12:22 -0400)]
Fix memory leakage in ICU encoding conversion, and other code review.
Callers of icu_to_uchar() neglected to pfree the result string when done
with it. This results in catastrophic memory leaks in varstr_cmp(),
because of our prevailing assumption that btree comparison functions don't
leak memory. For safety, make all the call sites clean up leaks, though
I suspect that we could get away without it in formatting.c. I audited
callers of icu_from_uchar() as well, but found no places that seemed to
have a comparable issue.
Add function API specifications for icu_to_uchar() and icu_from_uchar();
the lack of any thought-through specification is perhaps not unrelated
to the existence of this bug in the first place. Fix icu_to_uchar()
to guarantee a nul-terminated result; although no existing caller appears
to care, the fact that it would have been nul-terminated except in
extreme corner cases seems ideally designed to bite someone on the rear
someday. Fix ucnv_fromUChars() destCapacity argument --- in the worst
case, that could perhaps have led to a non-nul-terminated result, too.
Fix icu_from_uchar() to have a more reasonable definition of the function
result --- no callers are actually paying attention, so this isn't a live
bug, but it's certainly sloppily designed. Const-ify icu_from_uchar()'s
input string for consistency.
That is not the end of what needs to be done to these functions, but
it's as much as I have the patience for right now.
Tom Lane [Fri, 23 Jun 2017 15:03:04 +0000 (11:03 -0400)]
Add testing to detect errors of omission in "pin" dependency creation.
It's essential that initdb.c's setup_depend() scan each system catalog
that could contain objects that need to have "p" (pin) entries in pg_depend
or pg_shdepend. Forgetting to add that, either when a catalog is first
invented or when it first acquires DATA() entries, is an obvious bug
hazard. We can detect such omissions at reasonable cost by probing every
OID-containing system catalog to see whether the lowest-numbered OID in it
is pinned. If so, the catalog must have been properly accounted for in
setup_depend(). If the lowest OID is above FirstNormalObjectId then the
catalog must have been empty at the end of initdb, so it doesn't matter.
There are a small number of catalogs whose first entry is made later in
initdb than setup_depend(), resulting in nonempty expected output of the
test, but these can be manually inspected to see that they are OK. Any
future mistake of this ilk will manifest as a new entry in the test's
output.
Since pg_conversion is already in the test's output, add it to the set of
catalogs scanned by setup_depend(). That has no effect today (hence, no
catversion bump here) but it will protect us if we ever do add pin-worthy
conversions.
This test is very much like the catalog sanity checks embodied in
opr_sanity.sql and type_sanity.sql, but testing pg_depend doesn't seem to
fit naturally into either of those scripts' charters. Hence, invent a new
test script misc_sanity.sql, which can be a home for this as well as tests
on any other catalogs we might want in future.
Alvaro Herrera [Thu, 22 Jun 2017 21:12:27 +0000 (17:12 -0400)]
Fix typos in README.dependencies
There was a logic error in a formula, reported by Atsushi Torokoshi.
Ashutosh Bapat furthermore recommended to change notation for a variable
that was re-using a letter from a previous formula, though his proposed
patch contained a small error in attributing what the new letter is for.
Also, instead of his proposed d' I ended up using e, to avoid confusing
the reader with quotes which are used differently in the explaining
prose.
Alvaro Herrera [Thu, 22 Jun 2017 17:50:26 +0000 (13:50 -0400)]
Fix autovacuum launcher attachment to its DSA
The autovacuum launcher doesn't actually do anything with its DSA other
than creating it and attaching to it, but it's been observed that after
longjmp'ing to the standard error handling block (for example after
getting SIGINT) the autovacuum enters an infinite loop reporting that it
cannot attach to its DSA anymore (which is correct, because it's already
attached to it.) Fix by only attempting to attach if not already
attached.
I introduced this bug together with BRIN autosummarization in 7526e10224f0.
Reported-by: Yugo Nagata.
Author: Thomas Munro. I added the comment to go with it.
Discussion: https://postgr.es/m/20170621211538.0c9eae73.nagata@sraoss.co.jp
Andres Freund [Wed, 21 Jun 2017 21:14:45 +0000 (14:14 -0700)]
Fix possibility of creating a "phantom" segment after promotion.
When promoting a standby just after a XLOG_SWITCH record was replayed,
and next segment(s) are already are locally available (via walsender,
restore_command + trigger/recovery target), that segment could
accidentally be recycled onto the past of the new timeline. Later
checkpointer would create a .ready file for it, assuming there was an
error during creation, and it would get archived. That causes trouble
if another standby is later brought up from a basebackup from before
the timeline creation, because it would try to read the
segment, because XLogFileReadAnyTLI just tries all possible timelines,
which doesn't have valid contents. Thus replay would fail.
The problem, if already occurred, can be fixed by removing the segment
and/or having restore_command filter it out.
The reason for the creation of such "phantom" segments was, that after
an XLOG_SWITCH record the EndOfLog variable points to the beginning of
the next segment, and RemoveXlogFile() used XLByteToPrevSeg().
Normally RemoveXlogFile() doing so is harmless, because the last
segment will still exist preventing InstallXLogFileSegment() from
causing harm, but just after promotion there's no previous segment on
the new timeline.
Fix that by using XLByteToSeg() instead of XLByteToPrevSeg().
Author: Andres Freund Reported-By: Greg Burek
Discussion: https://postgr.es/m/20170619073026.zcwpe6mydsaz5ygd@alap3.anarazel.de
Backpatch: 9.2-, bug older than all supported versions
Tom Lane [Wed, 21 Jun 2017 19:35:54 +0000 (15:35 -0400)]
Phase 3 of pgindent updates.
Don't move parenthesized lines to the left, even if that means they
flow past the right margin.
By default, BSD indent lines up statement continuation lines that are
within parentheses so that they start just to the right of the preceding
left parenthesis. However, traditionally, if that resulted in the
continuation line extending to the right of the desired right margin,
then indent would push it left just far enough to not overrun the margin,
if it could do so without making the continuation line start to the left of
the current statement indent. That makes for a weird mix of indentations
unless one has been completely rigid about never violating the 80-column
limit.
This behavior has been pretty universally panned by Postgres developers.
Hence, disable it with indent's new -lpl switch, so that parenthesized
lines are always lined up with the preceding left paren.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Tom Lane [Wed, 21 Jun 2017 19:18:54 +0000 (15:18 -0400)]
Phase 2 of pgindent updates.
Change pg_bsd_indent to follow upstream rules for placement of comments
to the right of code, and remove pgindent hack that caused comments
following #endif to not obey the general rule.
Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using
the published version of pg_bsd_indent, but a hacked-up version that
tried to minimize the amount of movement of comments to the right of
code. The situation of interest is where such a comment has to be
moved to the right of its default placement at column 33 because there's
code there. BSD indent has always moved right in units of tab stops
in such cases --- but in the previous incarnation, indent was working
in 8-space tab stops, while now it knows we use 4-space tabs. So the
net result is that in about half the cases, such comments are placed
one tab stop left of before. This is better all around: it leaves
more room on the line for comment text, and it means that in such
cases the comment uniformly starts at the next 4-space tab stop after
the code, rather than sometimes one and sometimes two tabs after.
Also, ensure that comments following #endif are indented the same
as comments following other preprocessor commands such as #else.
That inconsistency turns out to have been self-inflicted damage
from a poorly-thought-through post-indent "fixup" in pgindent.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Tom Lane [Wed, 21 Jun 2017 18:39:04 +0000 (14:39 -0400)]
Initial pgindent run with pg_bsd_indent version 2.0.
The new indent version includes numerous fixes thanks to Piotr Stefaniak.
The main changes visible in this commit are:
* Nicer formatting of function-pointer declarations.
* No longer unexpectedly removes spaces in expressions using casts,
sizeof, or offsetof.
* No longer wants to add a space in "struct structname *varname", as
well as some similar cases for const- or volatile-qualified pointers.
* Declarations using PG_USED_FOR_ASSERTS_ONLY are formatted more nicely.
* Fixes bug where comments following declarations were sometimes placed
with no space separating them from the code.
* Fixes some odd decisions for comments following case labels.
* Fixes some cases where comments following code were indented to less
than the expected column 33.
On the less good side, it now tends to put more whitespace around typedef
names that are not listed in typedefs.list. This might encourage us to
put more effort into typedef name collection; it's not really a bug in
indent itself.
There are more changes coming after this round, having to do with comment
indentation and alignment of lines appearing within parentheses. I wanted
to limit the size of the diffs to something that could be reviewed without
one's eyes completely glazing over, so it seemed better to split up the
changes as much as practical.
Tom Lane [Wed, 21 Jun 2017 18:26:21 +0000 (14:26 -0400)]
Adjust pgindent script to use pg_bsd_indent 2.0.
Update version-checking code and list of switches. Delete obsolete
quasi-support for using GNU indent. Remove a lot of no-longer-needed
workarounds for bugs of the old version, and improve comments for
the hacks that remain. Update run_build() subroutine to fetch the
pg_bsd_indent code from the newly established git repo for it.
In passing, fix pgindent to not overwrite files that require no changes;
this makes it a bit more friendly to run on a built tree.
Adjust relevant documentation.
Remove indent.bsd.patch; it's not relevant anymore (and was obsolete
long ago anyway). Likewise remove pgcppindent, since we're no longer
in the business of shipping C++ code.
Piotr Stefaniak is responsible for most of the algorithmic changes
to the pgindent script; I did the rest.
Tom Lane [Tue, 20 Jun 2017 21:06:43 +0000 (17:06 -0400)]
Make opr_sanity test complain about built-in functions marked prosecdef.
Currently, there are no built-in functions that are SECURITY DEFINER.
But we just found an instance where one was mistakenly marked that way,
so it seems prudent to add a test about it. If we ever grow some
functions that are intentionally SECURITY DEFINER, we can alter the
expected output of this test, or adjust the query to filter out functions
for which it's okay.
Tom Lane [Tue, 20 Jun 2017 17:39:57 +0000 (13:39 -0400)]
Upgrade documentation connected with shared_preload_libraries et al.
Noplace in the documentation actually defined what these variables
contain. Define them as lists of arguments for LOAD, and improve
that command's documentation a bit.
Bruce Momjian [Tue, 20 Jun 2017 17:20:02 +0000 (13:20 -0400)]
pg_upgrade: start/stop new server after pg_resetwal
When commit 0f33a719fdbb5d8c43839ea0d2c90cd03e2af2d2 removed the
instructions to start/stop the new cluster before running rsync, it was
now possible for pg_resetwal/pg_resetxlog to leave the final WAL record
at wal_level=minimum, preventing upgraded standby servers from
reconnecting.
This patch fixes that by having pg_upgrade unconditionally start/stop
the new cluster after pg_resetwal/pg_resetxlog has run.
Backpatch through 9.2 since, though the instructions were added in PG
9.5, they worked all the way back to 9.2.
Tom Lane [Tue, 20 Jun 2017 17:02:42 +0000 (13:02 -0400)]
Don't downcase entries within shared_preload_libraries et al.
load_libraries(), which processes the various xxx_preload_libraries GUCs,
was parsing them using SplitIdentifierString() which isn't really
appropriate for values that could be path names: it downcases unquoted
text, and it doesn't allow embedded whitespace unless quoted.
Use SplitDirectoriesString() instead. That also allows us to simplify
load_libraries() a bit, since canonicalize_path() is now done for it.
While this definitely seems like a bug fix, it has the potential to
break configuration settings that accidentally worked before because
of the downcasing behavior. Also, there's an easy workaround for the
bug, namely to double-quote troublesome text. Hence, no back-patch.
Peter Eisentraut [Tue, 20 Jun 2017 16:25:07 +0000 (12:25 -0400)]
Tweak publication fetching in psql
Viewing a table with \d in psql also shows the publications at table is
in. If a publication is concurrently dropped, this shows an error,
because the view pg_publication_tables internally uses
pg_get_publication_tables(), which uses a catalog snapshot. This can be
particularly annoying if a for-all-tables publication is concurrently
dropped.
To avoid that, write the query in psql differently. Expose the function
pg_relation_is_publishable() to SQL and write the query using that.
That still has a risk of being affected by concurrent catalog changes,
but in this case it would be a table drop that causes problems, and then
the psql \d command wouldn't be interesting anymore anyway.
Tom Lane [Mon, 19 Jun 2017 22:32:22 +0000 (18:32 -0400)]
Fix materialized-view documentation oversights.
When materialized views were added, psql's \d commands were made to
treat them as a separate object category ... but not everyplace in the
documentation or comments got the memo.
Noted by David Johnston. Back-patch to 9.3 where matviews came in.
Tom Lane [Mon, 19 Jun 2017 19:33:41 +0000 (15:33 -0400)]
Avoid regressions in foreign-key-based selectivity estimates.
David Rowley found that the "use the smallest per-column selectivity"
heuristic applied in some cases by get_foreign_key_join_selectivity()
was badly off if the FK columns are independent, producing estimates
much worse than we got before that code was added in 9.6.
One case where that heuristic was used was for LEFT and FULL outer joins
with the referenced rel on the outside of the join. But we should not
really need to special-case those here. eqjoinsel() never has had such a
special case; the correction is applied by calc_joinrel_size_estimate()
instead. Let's just estimate such cases like inner joins and rely on that
later adjustment. (I think there was something of a thinko here, in that
the comments seem to be thinking about the selectivity as defined for
semi/anti joins; but that shouldn't apply to left/full joins.) Add a
regression test exercising such a case to show that this is sane in
at least some cases.
The other case where we used that heuristic was for SEMI/ANTI outer joins,
either if the referenced rel was on the outside, or if it was on the inside
but was part of a join within the RHS. In either case, the FK doesn't give
us a lot of traction towards estimating the selectivity. To ensure that
we don't have regressions from what happened before 9.6, let's punt by
ignoring the FK in such cases and applying the traditional selectivity
calculation. (We might be able to improve on that later, but for now
I just want to be sure it's not worse than 9.5.)
Report and patch by David Rowley, simplified a bit by me. Back-patch
to 9.6 where this code was added.
Tom Lane [Mon, 19 Jun 2017 15:02:45 +0000 (11:02 -0400)]
On Windows, make pg_dump use binary mode for compressed plain text output.
The combination of -Z -Fp and output to stdout resulted in corrupted
output data, because we left stdout in text mode, resulting in newline
conversion being done on the compressed stream. Switch stdout to binary
mode for this case, at the same place where we do it for non-text output
formats.
Report and patch by Kuntal Ghosh, tested by Ashutosh Sharma and Neha
Sharma. Back-patch to all supported branches.
Andres Freund [Mon, 19 Jun 2017 01:48:22 +0000 (18:48 -0700)]
Fix leaking of small spilled subtransactions during logical decoding.
When, during logical decoding, a transaction gets too big, it's
contents get spilled to disk. Not just the top-transaction gets
spilled, but *also* all of its subtransactions, even if they're not
that large themselves. Unfortunately we didn't clean up
such small spilled subtransactions from disk.
Fix that, by keeping better track of whether a transaction has been
spilled to disk.
Author: Andres Freund Reported-By: Dmitriy Sarafannikov, Fabrízio de Royes Mello
Discussion:
https://postgr.es/m/1457621358.355011041@f382.i.mail.ru
https://postgr.es/m/CAFcNs+qNMhNYii4nxpO6gqsndiyxNDYV0S=JNq0v_sEE+9PHXg@mail.gmail.com
Backpatch: 9.4-, where logical decoding was introduced
Peter Eisentraut [Sat, 17 Jun 2017 01:23:22 +0000 (21:23 -0400)]
Define HAVE_UCOL_STRCOLLUTF8 on Windows
This should normally be determined by a configure check, but until
someone figures out how to do that on Windows, it's better that the code
uses the new function by default.
Tom Lane [Sat, 17 Jun 2017 03:14:27 +0000 (23:14 -0400)]
Teach pgindent to skip files generated by bison or flex automatically.
If a .c or .h file corresponds to a .y or .l file, skip indenting it.
There's no point in reindenting derived files, and these files tend to
confuse pgindent. (Which probably indicates a bug in BSD indent, but
I can't get excited about trying to fix it.)
For the same reasons, add src/backend/utils/fmgrtab.c to the set of
files excluded by src/tools/pgindent/exclude_file_patterns.
The point of doing this is that it makes it safe to run pgindent over
the tree without doing "make maintainer-clean" first. While these are
not the only derived .c/.h files in the tree, they are the only ones
pgindent fails on. Removing that prerequisite step results in one less
way to mess up a pgindent run, and it's necessary if we ever hope to get
to the ease of running pgindent via "make indent".
Fix dependency, when changing a function's argument/return type.
When a new base type is created using the old-style procedure of first
creating the input/output functions with "opaque" in place of the base
type, the "opaque" argument/return type is changed to the final base type,
on CREATE TYPE. However, we did not create a pg_depend record when doing
that, so the functions were left not depending on the type.
Noah Misch [Fri, 16 Jun 2017 07:16:11 +0000 (00:16 -0700)]
Reconcile nodes/*funcs.c with PostgreSQL 10 work.
The _equalTableFunc() omission of coltypmods has semantic significance,
but I did not track down resulting user-visible bugs, if any. The other
changes are cosmetic only, affecting order. catversion bump due to
readfuncs.c field order change.
Tom Lane [Thu, 15 Jun 2017 19:56:12 +0000 (15:56 -0400)]
Make configure check for IPC::Run when --enable-tap-tests is specified.
The TAP tests mostly don't work without IPC::Run, and the reason for
the failure is not immediately obvious from the error messages you get.
So teach configure to reject --enable-tap-tests unless IPC::Run exists.
Mostly this just involves adding ax_prog_perl_modules.m4 from the GNU
autoconf archives.
This was discussed last year, but we held off on the theory that we might
be switching to CMake soon. That's evidently not happening for v10,
so let's absorb this now.
Tom Lane [Thu, 15 Jun 2017 19:03:39 +0000 (15:03 -0400)]
Fix low-probability leaks of PGresult objects in the backend.
We had three occurrences of essentially the same coding pattern
wherein we tried to retrieve a query result from a libpq connection
without blocking. In the case where PQconsumeInput failed (typically
indicating a lost connection), all three loops simply gave up and
returned, forgetting to clear any previously-collected PGresult
object. Since those are malloc'd not palloc'd, the oversight results
in a process-lifespan memory leak.
One instance, in libpqwalreceiver, is of little significance because
the walreceiver process would just quit anyway if its connection fails.
But we might as well fix it.
The other two instances, in postgres_fdw, are somewhat more worrisome
because at least in principle the scenario could be repeated, allowing
the amount of memory leaked to build up to something worth worrying
about. Moreover, in these cases the loops contain CHECK_FOR_INTERRUPTS
calls, as well as other calls that could potentially elog(ERROR),
providing another way to exit without having cleared the PGresult.
Here we need to add PG_TRY logic similar to what exists in quite a
few other places in postgres_fdw.
Coverity noted the libpqwalreceiver bug; I found the other two cases
by checking all calls of PQconsumeInput.
Back-patch to all supported versions as appropriate (9.2 lacks
postgres_fdw, so this is really quite unexciting for that branch).
Bruce Momjian [Thu, 15 Jun 2017 16:30:02 +0000 (12:30 -0400)]
docs: Fix pg_upgrade standby server upgrade docs
It was unsafe to instruct users to start/stop the server after
pg_upgrade was run but before the standby servers were rsync'ed. The
new instructions avoid this.
RELEASE NOTES: This fix should be mentioned in the minor release notes.
Reported-by: Dmitriy Sarafannikov and Sergey Burladyan
Discussion: https://postgr.es/m/87wp8o506b.fsf@seb.koffice.internal
Backpatch-through: 9.5, where standby server upgrade instructions first appeared
Tatsuo Ishii [Thu, 15 Jun 2017 01:01:39 +0000 (10:01 +0900)]
Fix document bug regarding read only transactions.
It was explained that read only transactions (not in standby) allow to
update sequences. This had been wrong since the commit: 05d8a561ff85db1545f5768fe8d8dc9d99ad2ef7
Robert Haas [Wed, 14 Jun 2017 20:19:46 +0000 (16:19 -0400)]
Fix problems related to RangeTblEntry members enrname and enrtuples.
Commit 18ce3a4ab22d2984f8540ab480979c851dae5338 failed to update
the comments in parsenodes.h for the new members, and made only
incomplete updates to src/backend/nodes
Andres Freund [Wed, 14 Jun 2017 18:57:21 +0000 (11:57 -0700)]
Don't force-assign transaction id when exporting a snapshot.
Previously we required every exported transaction to have an xid
assigned. That was used to check that the exporting transaction is
still running, which in turn is needed to guarantee that that
necessary rows haven't been removed in between exporting and importing
the snapshot.
The exported xid caused unnecessary problems with logical decoding,
because slot creation has to wait for all concurrent xid to finish,
which in turn serializes concurrent slot creation. It also
prohibited snapshots to be exported on hot-standby replicas.
Instead export the virtual transactionid, which avoids the unnecessary
serialization and the inability to export snapshots on standbys. This
changes the file name of the exported snapshot, but since we never
documented what that one means, that seems ok.
Author: Petr Jelinek, slightly editorialized by me Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/f598b4b8-8cd7-0d54-0939-adda763d8c34@2ndquadrant.com
Robert Haas [Wed, 14 Jun 2017 17:13:11 +0000 (13:13 -0400)]
Teach predtest.c about CHECK clauses to fix partitioning bugs.
In a CHECK clause, a null result means true, whereas in a WHERE clause
it means false. predtest.c provided different functions depending on
which set of semantics applied to the predicate being proved, but had
no option to control what a null meant in the clauses provided as
axioms. Add one.
Use that in the partitioning code when figuring out whether the
validation scan on a new partition can be skipped. Rip out the
old logic that attempted (not very successfully) to compensate
for the absence of the necessary support in predtest.c.
Ashutosh Bapat and Robert Haas, reviewed by Amit Langote and
incorporating feedback from Tom Lane.
Alvaro Herrera [Wed, 14 Jun 2017 15:29:05 +0000 (11:29 -0400)]
Avoid bogus TwoPhaseState locking sequences
The optimized code in 728bd991c3c4 contains a few invalid locking
sequences. To wit, the original code would try to acquire an lwlock
that it already holds. Avoid this by moving lock acquisitions to
higher-level code, and install appropriate assertions in low-level that
the correct mode is held.
Authors: Michael Paquier, Álvaro Herrera Reported-By: chuanting wang
Bug: #14680
Discussion: https://postgr.es/m/20170531033228.1487.10124@wrigleys.postgresql.org
Tom Lane [Wed, 14 Jun 2017 15:10:05 +0000 (11:10 -0400)]
Fix no-longer-valid shortcuts in expression_returns_set().
expression_returns_set() used to short-circuit its recursion upon
seeing certain node types, such as DistinctExpr, that it knew the
executor did not support set-valued arguments for. That was never
inherent, though, just a reflection of laziness in execQual.c.
With the new implementation of SRFs there is no reason to think
that any scalar-valued expression node could not have a set-valued
subexpression, except for AggRefs and WindowFuncs where we know there
is a parser check rejecting it. And indeed, the shortcut causes
unexpected failures for cases such as a SRF underneath DistinctExpr,
because the planner stops looking for SRFs too soon.
Tom Lane [Wed, 14 Jun 2017 14:26:46 +0000 (10:26 -0400)]
Fix violations of CatalogTupleInsert/Update/Delete abstraction.
In commits 2f5c9d9c9 and ab0289651 we invented an abstraction layer
to insulate catalog manipulations from direct heap update calls.
But evidently some patches that hadn't landed in-tree at that point
didn't get the memo completely. Fix a couple of direct calls to
simple_heap_delete to use CatalogTupleDelete instead; these appear
to have been added in commits 7c4f52409 and 7b504eb28. This change is
purely cosmetic ATM, but there's no point in having an abstraction layer
if we allow random code to break it.
Dean Rasheed [Wed, 14 Jun 2017 08:00:01 +0000 (09:00 +0100)]
Teach PL/pgSQL about partitioned tables.
Table partitioning, introduced in commit f0e44751d7, added a new
relkind - RELKIND_PARTITIONED_TABLE. Update a couple of places in
PL/pgSQL to handle it. Specifically plpgsql_parse_cwordtype() and
build_row_from_class() needed updating in order to make table%ROWTYPE
and table.col%TYPE work for partitioned tables.
Dean Rasheed [Wed, 14 Jun 2017 07:43:40 +0000 (08:43 +0100)]
Teach RemoveRoleFromObjectPolicy() about partitioned tables.
Table partitioning, introduced in commit f0e44751d7, added a new
relkind - RELKIND_PARTITIONED_TABLE. Update
RemoveRoleFromObjectPolicy() to handle it, otherwise DROP OWNED BY
will fail if the role has any RLS policies referring to partitioned
tables.
Tom Lane [Wed, 14 Jun 2017 03:46:39 +0000 (23:46 -0400)]
Disallow set-returning functions inside CASE or COALESCE.
When we reimplemented SRFs in commit 69f4b9c85, our initial choice was
to allow the behavior to vary from historical practice in cases where a
SRF call appeared within a conditional-execution construct (currently,
only CASE or COALESCE). But that was controversial to begin with, and
subsequent discussion has resulted in a consensus that it's better to
throw an error instead of executing the query differently from before,
so long as we can provide a reasonably clear error message and a way to
rewrite the query.
Hence, add a parser mechanism to allow detection of such cases during
parse analysis. The mechanism just requires storing, in the ParseState,
a pointer to the set-returning FuncExpr or OpExpr most recently emitted
by parse analysis. Then the parsing functions for CASE and COALESCE can
detect the presence of a SRF in their arguments by noting whether this
pointer changes while analyzing their arguments. Furthermore, if it does,
it provides a suitable error cursor location for the complaint. (This
means that if there's more than one SRF in the arguments, the error will
point at the last one to be analyzed not the first. While connoisseurs of
parsing behavior might find that odd, it's unlikely the average user would
ever notice.)
While at it, we can also provide more specific error messages than before
about some pre-existing restrictions, such as no-SRFs-within-aggregates.
Also, reject at parse time cases where a NULLIF or IS DISTINCT FROM
construct would need to return a set. We've never supported that, but the
restriction is depended on in more subtle ways now, so it seems wise to
detect it at the start.
Also, provide some documentation about how to rewrite a SRF-within-CASE
query using a custom wrapper SRF.
It turns out that the information_schema.user_mapping_options view
contained an instance of exactly the behavior we're now forbidding; but
rewriting it makes it more clear and safer too.
initdb forced because of user_mapping_options change.
Patch by me, with error message suggestions from Alvaro Herrera and
Andres Freund, pursuant to a complaint from Regina Obe.
Peter Eisentraut [Tue, 13 Jun 2017 20:10:11 +0000 (16:10 -0400)]
doc: Update example version numbers in pg_upgrade documentation
The exact numbers don't matter, since they are examples, but it was
looking quite dated.
For the target version, we now automatically substitute the current
major version. The updated example source version should be good for a
couple of years.
Tom Lane [Tue, 13 Jun 2017 17:05:59 +0000 (13:05 -0400)]
Re-run pgindent.
This is just to have a clean base state for testing of Piotr Stefaniak's
latest version of FreeBSD indent. I fixed up a couple of places where
pgindent would have changed format not-nicely. perltidy not included.
This doesn't actually matter at present, because the current code
never consults null_index for range partitions. However, leaving
it uninitialized is still a bad idea, so let's not do that.
Dean Rasheed [Tue, 13 Jun 2017 16:30:36 +0000 (17:30 +0100)]
Teach relation_is_updatable() about partitioned tables.
Table partitioning, introduced in commit f0e44751d7, added a new
relkind - RELKIND_PARTITIONED_TABLE. Update relation_is_updatable() to
handle it. Specifically, partitioned tables and simple views built on
top of them are updatable.
This affects the SQL-callable functions pg_relation_is_updatable() and
pg_column_is_updatable(), and the views information_schema.views and
information_schema.columns.
Tom Lane [Tue, 13 Jun 2017 14:54:39 +0000 (10:54 -0400)]
In initdb, defend against assignment of NULL values to not-null columns.
Previously, you could write _null_ in a BKI DATA line for a column that's
supposed to be NOT NULL and initdb would let it pass, probably breaking
subsequent accesses to the row. No doubt the original coding overlooked
this simple sanity check because in the beginning we didn't have any way
to mark catalog columns NOT NULL at initdb time.
Tom Lane [Tue, 13 Jun 2017 03:29:44 +0000 (23:29 -0400)]
Fix confusion about number of subplans in partitioned INSERT setup.
ExecInitModifyTable() thought there was a plan per partition, but no,
there's only one. The problem had escaped detection so far because there
would only be visible misbehavior if there were a SubPlan (not an InitPlan)
in the quals being duplicated for each partition. However, valgrind
detected a bogus memory access in test cases added by commit 4f7a95be2,
and investigation of that led to discovery of the bug. The additional
test case added here crashes without the patch.