granicus.if.org Git - postgresql/log

Put back parallel-safety guards in plpython and src/test/regress/.

I'd hoped that commit 3b8f6e75f was sufficient to ensure parallel safety
even when a build started in a subdirectory requires rebuilding of
generated headers.  This isn't so, because making submake-generated-headers
a prerequisite of "all" isn't enough to ensure it's completed before
starting on "all"'s other prerequisites.  The explicit dependencies we put
on the recursive make targets ensure safe ordering before we recurse into
child directories, but they don't protect targets to be made in the current
directory.  Hence, put back some ordering dependencies in directories that
we've traditionally expected to be starting points for "standalone" builds,
to wit src/pl/plpython and src/test/regress.  (The former needs this in
order to minimize the work involved in building for both python 2 and
python 3; the latter to support packagings that make the regression tests
available for out-of-build-tree execution.)  Adjust some other dependencies
so that these two cases work correctly even at high -j settings.

I'm not terribly happy with this partial solution, but I don't see a
way to do better without massive makefile restructuring, which we surely
aren't doing at this point in the development cycle.  In any case, it's
little if any worse than what we had in prior releases.

Discussion: https://postgr.es/m/1523353963.8169.26.camel@gunduz.org

Fix IndexOnlyScan counter for heap fetches in parallel mode

The HeapFetches counter was using a simple value in IndexOnlyScanState,
which fails to propagate values from parallel workers; so the counts are
wrong when IndexOnlyScan runs in parallel. Move it to Instrumentation,
like all the other counters.

While at it, change INSERT ON CONFLICT conflicting tuple counter to use
the new ntuples2 instead of nfiltered2, which is a blatant misuse.

Discussion: https://postgr.es/m/20180409215851.idwc75ct2bzi6tea@alvherre.pgsql

Fix pgxs.mk to not try to build generated headers in external builds.

Per Julien Rouhaud and the buildfarm. This is not quite Julien's
patch: there's no need to lobotomize this build rule when building
contrib modules in-tree, so set NO_GENERATED_HEADERS only if PGXS.

In passing, also set NO_TEMP_INSTALL in external builds. This doesn't
seem to be fixing any live bug, because "make check" in an external
build just produces the expected error message without first trying to
make a temp install ... but it's far from obvious why it doesn't, so
this change seems like good future-proofing.

Julien Rouhaud and Tom Lane

Discussion: https://postgr.es/m/CAOBaU_YH=g68opbbMk8is3jNwhoXGa8ckRSre1nx0Obe1C7i-Q@mail.gmail.com

Fix comment on B-tree insertion fastpath condition.

The comment earlier in the function correctly states "and the insertion
key is strictly greater than the first key in this page". That is what
we check here, not "greater than or equal".

Fix partial-build problems introduced by having more generated headers.

Commit 372728b0d created some problems for usages like building a
subdirectory without having first done "make all" at the top level,
or for proceeding directly to "make install" without "make all".
The only reasonably clean way to fix this seems to be to force the
submake-generated-headers rule to fire in *any* "make all" or "make
install" command anywhere in the tree. To avoid lots of redundant work,
as well as parallel make jobs possibly clobbering each others' output, we
still need to be sure that the rule fires only once in a recursive build.
For that, adopt the same MAKELEVEL hack previously used for "temp-install".
But try to document it a bit better.

The submake-errcodes mechanism previously used in src/port/ and src/common/
is subsumed by this, so we can get rid of those special cases. It was
inadequate for src/common/ anyway after the aforesaid commit, and it always
risked parallel attempts to build errcodes.h.

Discussion: https://postgr.es/m/E1f5FAB-0006LU-MB@gemulon.postgresql.org

Fix incorrect logic for choosing the next Parallel Append subplan

In 499be013de support for pruning unneeded Append subnodes was added.
The logic in that commit was not correctly checking if the next subplan
was in fact a valid subplan. This could cause parallel workers processes
to be given a subplan to work on which didn't require any work.

Per code review following an otherwise unexplained regression failure in
buildfarm member Pademelon. (We haven't been able to reproduce the
failure, so this is a bit of a blind fix in terms of whether it'll
actually fix it; but it is a clear bug nonetheless).

In passing, also add a comment to explain what first_partial_plan means.

Author: David Rowley
Discussion: https://postgr.es/m/CAKJS1f_E5r05hHUVG3UmCQJ49DGKKHtN=SHybD44LdzBn+CJng@mail.gmail.com

Silence some warnings in TAP tests

Author: Michael Paquier

Make sure pg_rewind can't run as root

Previously a warning was printed, but the tool actually kept running
even when running as root. This is something we definitely want to
prevent, but since this means a behavior change, not backpatching.

Author: Michael Paquier

Reduce chattiness of genbki.pl and Gen_fmgrtab.pl.

Make these scripts emit just one log message when they run, not one
per output file. The latter is way too verbose in the wake of
commit 372728b0d. The specific wording used is what already existed
in the MSVC scripts.

John Naylor

Discussion: https://postgr.es/m/11103.1523208822@sss.pgh.pa.us

Make reformat_dat_file.pl preserve all blank lines.

In its original form, reformat_dat_file.pl smashed consecutive blank
lines to a single blank line, which was helpful for mopping up excess
whitespace during the bootstrap data format conversion. But going
forward, there seems little reason to do that; if developers want to
put in multiple blank lines, let 'em. This makes it conform to the
documentation I (tgl) wrote, too.

In passing, clean up some sloppy markup choices in bki.sgml.

John Naylor

Discussion: https://postgr.es/m/28827.1523039259@sss.pgh.pa.us

Further cleanup of client dependencies on src/include/catalog headers.

In commit 9c0a0de4c, I'd failed to notice that catalog/catalog.h
should also be considered a frontend-unsafe header, because it includes
(and needs) the full form of pg_class.h, not to mention relcache.h.
However, various frontend code was depending on it to get
TABLESPACE_VERSION_DIRECTORY, so refactoring of some sort is called for.

The cleanest answer seems to be to move TABLESPACE_VERSION_DIRECTORY,
as well as the OIDCHARS symbol, to common/relpath.h.  Do that, and mop up
inclusions as necessary.  (I found that quite a few current users of
catalog/catalog.h don't seem to need it at all anymore, apparently as a
result of the refactorings that created common/relpath.[hc].  And
initdb.c needed it only as a route to pg_class_d.h.)

Discussion: https://postgr.es/m/6629.1523294509@sss.pgh.pa.us

catversion bump for online-checksums revert

Lack thereof pointed out by Tom Lane.

Revert "Allow on-line enabling and disabling of data checksums"

This reverts the backend sides of commit 1fde38beaa0c3e66c340efc7cc0dc272d6254bb0.
I have, at least for now, left the pg_verify_checksums tool in place, as
this tool can be very valuable without the rest of the patch as well,
and since it's a read-only tool that only runs when the cluster is down
it should be a lot safer.

Improve covering index documentation

Add missed description of pg_constraint.conincluding

Shinoda, Noriyoshi and Alexander Korotkov

Minor comment updates

Fix a couple of typos, and update a comment about why we set a BMS to
NULL.

Author: David Rowley
Discussion: http://postgr.es/m/CAKJS1f-tux=KdUz6ENJ9GHM_V2qgxysadYiOyQS9Ko9PTteVhQ@mail.gmail.com

Add missed bms_copy() in perform_pruning_combine_step

We were initializing a BMS to merely reference an existing one, which
would cause a double-free (and a crash) when the recursive algorithm
tried to intersect it with an empty one. Fix it by creating a copy at
initialization time.

Reported-by: sqlsmith (by way of Andreas Seltenreich)
Author: Amit Langote
Discussion: https://postgr.es/m/87in923lyw.fsf@ansel.ydns.eu

Fix typo in comment.

Author: Kyotaro Horiguchi

Remove repeated test in contrib/amcheck

Repeating these tests adds unnecessary cycles, since no improvement in
test coverage is expected.

Cleanup from commit 8224de4f42ccf98e08db07b43d52fed72f962ebb.

Peter Geoghegan

Skip permissions test under MINGW/Windows

We don't support the same kind of permissions tests on Windows/MINGW, so
these tests really shouldn't be getting run on that platform.

Per buildfarm.

Fix additional breakage in covering-index patch.

CheckIndexCompatible() misused ComputeIndexAttrs() by not bothering
to fill ii_NumIndexAttrs and ii_NumIndexKeyAttrs in the passed
IndexInfo. Omission of ii_NumIndexAttrs was previously unimportant,
but now this matters because ComputeIndexAttrs depends on
ii_NumIndexKeyAttrs to decide how many columns it needs to report on.

(BTW, the fact that this oversight wasn't detected earlier implies
that we have no regression test verifying whether CheckIndexCompatible
ever succeeds. Bad dog. Not the job of this patch to fix it, though.)

Also, change the API of ComputeIndexAttrs so that it fills the opclass
output array for all column positions, as it does for the options output
array; positions for non-key index columns are filled with zeroes.
This isn't directly fixing any bug, but it seems like a good idea.

Per valgrind failure reports from buildfarm.

Alexander Korotkov, tweaked a bit by me

Discussion: https://postgr.es/m/CAPpHfduWrysrT-qAhn+3Ea5+Mg6Vhc-oA6o2Z-hRCPRdvf3tiw@mail.gmail.com

Doc: clarify explanation of pg_dump usage.

This section confusingly used both "infile" and "outfile" to refer
to the same file, i.e. the textual output of pg_dump. Use "dumpfile"
for both cases, per suggestion from Jonathan Katz.

Discussion: https://postgr.es/m/152311295239.31235.6487236091906987117@wrigleys.postgresql.org

Cosmetic cleanups in initial catalog data.

Write ',' and ';' for typdelim values instead of the obscurantist
ASCII octal equivalents. Not sure why anybody ever thought the
latter were better; maybe it had something to do with lack of
a better quoting convention, twenty-plus years ago?

Reassign a couple of high-numbered OIDs that were left in during
yesterday's mad rush to commit stuff of uncertain internal
temperature.

The latter requires a catversion bump, though the former wouldn't
since the end-result catalog data is unchanged.

Reduce worst-case shell command line length during "make install".

Addition of the catalog/pg_foo_d.h headers seems to have pushed us over
the brink of the maximum command line length for some older platforms
during "make install" for our header files. The main culprit here is
repetition of the target directory path, which could be long.
Rearrange so that we don't repeat that once per file, but only once
per subdirectory.

Per buildfarm.

Discussion: https://postgr.es/m/E1f5Dwm-0004n5-7O@gemulon.postgresql.org

Merge catalog/pg_foo_fn.h headers back into pg_foo.h headers.

Traditionally, include/catalog/pg_foo.h contains extern declarations
for functions in backend/catalog/pg_foo.c, in addition to its function
as the authoritative definition of the pg_foo catalog's rowtype.
In some cases, we'd been forced to split out those extern declarations
into separate pg_foo_fn.h headers so that the catalog definitions
could be #include'd by frontend code. That problem is gone as of
commit 9c0a0de4c, so let's undo the splits to make things less
confusing.

Discussion: https://postgr.es/m/23690.1523031777@sss.pgh.pa.us

Switch client-side code to include catalog/pg_foo_d.h not pg_foo.h.

Everything of use to frontend code should now appear in the _d.h files,
and making this change frees us from needing to worry about whether the
catalog header files proper are frontend-safe.

Remove src/interfaces/ecpg/ecpglib/pg_type.h entirely, as the previous
commit reduced it to a confusingly-named wrapper around pg_type_d.h.

In passing, make test_rls_hooks.c follow project convention of including
our own files with #include "" not <>.

Discussion: https://postgr.es/m/23690.1523031777@sss.pgh.pa.us

Replace our traditional initial-catalog-data format with a better design.

Historically, the initial catalog data to be installed during bootstrap
has been written in DATA() lines in the catalog header files.  This had
lots of disadvantages: the format was badly underdocumented, it was
very difficult to edit the data in any mechanized way, and due to the
lack of any abstraction the data was verbose, hard to read/understand,
and easy to get wrong.

Hence, move this data into separate ".dat" files and represent it in a way
that can easily be read and rewritten by Perl scripts.  The new format is
essentially "key => value" for each column; while it's a bit repetitive,
explicit labeling of each value makes the data far more readable and less
error-prone.  Provide a way to abbreviate entries by omitting field values
that match a specified default value for their column.  This allows removal
of a large amount of repetitive boilerplate and also lowers the barrier to
adding new columns.

Also teach genbki.pl how to translate symbolic OID references into
numeric OIDs for more cases than just "regproc"-like pg_proc references.
It can now do that for regprocedure-like references (thus solving the
problem that regproc is ambiguous for overloaded functions), operators,
types, opfamilies, opclasses, and access methods.  Use this to turn
nearly all OID cross-references in the initial data into symbolic form.
This represents a very large step forward in readability and error
resistance of the initial catalog data.  It should also reduce the
difficulty of renumbering OID assignments in uncommitted patches.

Also, solve the longstanding problem that frontend code that would like to
use OID macros and other information from the catalog headers often had
difficulty with backend-only code in the headers.  To do this, arrange for
all generated macros, plus such other declarations as we deem fit, to be
placed in "derived" header files that are safe for frontend inclusion.
(Once clients migrate to using these pg_*_d.h headers, it will be possible
to get rid of the pg_*_fn.h headers, which only exist to quarantine code
away from clients.  That is left for follow-on patches, however.)

The now-automatically-generated macros include the Anum_xxx and Natts_xxx
constants that we used to have to update by hand when adding or removing
catalog columns.

Replace the former manual method of generating OID macros for pg_type
entries with an automatic method, ensuring that all built-in types have
OID macros.  (But note that this patch does not change the way that
OID macros for pg_proc entries are built and used.  It's not clear that
making that match the other catalogs would be worth extra code churn.)

Add SGML documentation explaining what the new data format is and how to
work with it.

Despite being a very large change in the catalog headers, there is no
catversion bump here, because postgres.bki and related output files
haven't changed at all.

John Naylor, based on ideas from various people; review and minor
additional coding by me; previous review by Alvaro Herrera

Discussion: https://postgr.es/m/CAJVSVGWO48JbbwXkJz_yBFyGYW-M9YWxnPdxJBUosDC9ou_F0Q@mail.gmail.com

match_clause_to_index should check only key columns

Alexander Korotkov per gripe from Tom Lane noticed on valgrind-enabled
buildfarm members

Remove unused variable in non-assert-enabled build

Use field of structure in Assert directly

Jeff Janes

Add missing "static" markers.

Evidently forgotten in commit 11523e860. Per buildfarm member pademelon.

Attempt to stabilize partition_prune test output.

Disable index-only scan for tests that might report variable results
for "Heap Fetches" statistic due to concurrent transactions affecting
whether all-visible flags can be set.

Author: David Rowley
Discussion: https://postgr.es/m/CAKJS1f_yjtHDJnDzx1uuR_3D7beDVAkNQfWJhRLA1gvPCzkAhg@mail.gmail.com

Support index INCLUDE in the AM properties interface.

This rectifies an oversight in commit 8224de4f4, by adding a new
property 'can_include' for pg_indexam_has_property, and adjusting the
results of pg_index_column_has_property to give more appropriate
results for INCLUDEd columns.

Remove overzeleous assertions in pg_atomic_flag code.

The atomics code asserts proper alignment in various places. That's
mainly because the alignment of 64bit integers is not sufficient for
atomic operations on all platforms. Some ABIs only have four byte
alignment, but don't have atomic behavior when crossing page
boundaries.

The flags code isn't affected by that however, as the type alignment
always is sufficient for atomic operations. Nevertheless the code
asserted alignment requirements. Before 8c3debbb it was only broken on
hppa, after it probably affect further platforms.

Thus remove the assertions for pg_atomic_flag operators.

Per buildfarm animal pademelon.

Discussion: https://postgr.es/m/7223.1523124425@sss.pgh.pa.us
Backpatch: 9.5-

Fix EXEC BACKEND + Windows builds for group privs

Under EXEC BACKEND we also need to be going through the group privileges
setup since we do support that on Unixy systems, so add that to
SubPostmasterMain().

Under Windows, we need to simply return true from
GetDataDirectoryCreatePerm(), but that wasn't happening due to a missing
#else clause.

Per buildfarm.

Allow group access on PGDATA

Allow the cluster to be optionally init'd with read access for the
group.

This means a relatively non-privileged user can perform a backup of the
cluster without requiring write privileges, which enhances security.

The mode of PGDATA is used to determine whether group permissions are
enabled for directory and file creates.  This method was chosen as it's
simple and works well for the various utilities that write into PGDATA.

Changing the mode of PGDATA manually will not automatically change the
mode of all the files contained therein.  If the user would like to
enable group access on an existing cluster then changing the mode of all
the existing files will be required.  Note that pg_upgrade will
automatically change the mode of all migrated files if the new cluster
is init'd with the -g option.

Tests are included for the backend and all the utilities which operate
on the PG data directory to ensure that the correct mode is set based on
the data directory permissions.

Author: David Steele <david@pgmasters.net>
Reviewed-By: Michael Paquier, with discussion amongst many others.
Discussion: https://postgr.es/m/ad346fe6-b23e-59f1-ecb7-0e08390ad629%40pgmasters.net

Refactor dir/file permissions

Consolidate directory and file create permissions for tools which work
with the PG data directory by adding a new module (common/file_perm.c)
that contains variables (pg_file_create_mode, pg_dir_create_mode) and
constants to initialize them (0600 for files and 0700 for directories).

Convert mkdir() calls in the backend to MakePGDirectory() if the
original call used default permissions (always the case for regular PG
directories).

Add tests to make sure permissions in PGDATA are set correctly by the
tools which modify the PG data directory.

Authors: David Steele <david@pgmasters.net>,
Adam Brightwell <adam.brightwell@crunchydata.com>
Reviewed-By: Michael Paquier, with discussion amongst many others.
Discussion: https://postgr.es/m/ad346fe6-b23e-59f1-ecb7-0e08390ad629%40pgmasters.net

Support partition pruning at execution time

Existing partition pruning is only able to work at plan time, for query
quals that appear in the parsed query.  This is good but limiting, as
there can be parameters that appear later that can be usefully used to
further prune partitions.

This commit adds support for pruning subnodes of Append which cannot
possibly contain any matching tuples, during execution, by evaluating
Params to determine the minimum set of subnodes that can possibly match.
We support more than just simple Params in WHERE clauses. Support
additionally includes:

1. Parameterized Nested Loop Joins: The parameter from the outer side of the
   join can be used to determine the minimum set of inner side partitions to
   scan.

2. Initplans: Once an initplan has been executed we can then determine which
   partitions match the value from the initplan.

Partition pruning is performed in two ways.  When Params external to the plan
are found to match the partition key we attempt to prune away unneeded Append
subplans during the initialization of the executor.  This allows us to bypass
the initialization of non-matching subplans meaning they won't appear in the
EXPLAIN or EXPLAIN ANALYZE output.

For parameters whose value is only known during the actual execution
then the pruning of these subplans must wait.  Subplans which are
eliminated during this stage of pruning are still visible in the EXPLAIN
output.  In order to determine if pruning has actually taken place, the
EXPLAIN ANALYZE must be viewed.  If a certain Append subplan was never
executed due to the elimination of the partition then the execution
timing area will state "(never executed)".  Whereas, if, for example in
the case of parameterized nested loops, the number of loops stated in
the EXPLAIN ANALYZE output for certain subplans may appear lower than
others due to the subplan having been scanned fewer times.  This is due
to the list of matching subnodes having to be evaluated whenever a
parameter which was found to match the partition key changes.

This commit required some additional infrastructure that permits the
building of a data structure which is able to perform the translation of
the matching partition IDs, as returned by get_matching_partitions, into
the list index of a subpaths list, as exist in node types such as
Append, MergeAppend and ModifyTable.  This allows us to translate a list
of clauses into a Bitmapset of all the subpath indexes which must be
included to satisfy the clause list.

Author: David Rowley, based on an earlier effort by Beena Emerson
Reviewers: Amit Langote, Robert Haas, Amul Sul, Rajkumar Raghuwanshi,
Jesper Pedersen
Discussion: https://postgr.es/m/CAOG9ApE16ac-_VVZVvv0gePSgkg_BwYEV1NBqZFqDR2bBE0X0A@mail.gmail.com

Add bms_prev_member function

This works very much like the existing bms_last_member function, only it
traverses through the Bitmapset in the opposite direction from the most
significant bit down to the least significant bit.  A special prevbit value of
-1 may be used to have the function determine the most significant bit.  This
is useful for starting a loop.  When there are no members less than prevbit,
the function returns -2 to indicate there are no more members.

Author: David Rowley
Discussion: https://postgr.es/m/CAKJS1f-K=3d5MDASNYFJpUpc20xcBnAwNC1-AOeunhn0OtkWbQ@mail.gmail.com

Raise error when affecting tuple moved into different partition.

When an update moves a row between partitions (supported since
2f178441044b), our normal logic for following update chains in READ
COMMITTED mode doesn't work anymore. Cross partition updates are
modeled as an delete from the old and insert into the new
partition. No ctid chain exists across partitions, and there's no
convenient space to introduce that link.

Not throwing an error in a partitioned context when one would have
been thrown without partitioning is obviously problematic. This commit
introduces infrastructure to detect when a tuple has been moved, not
just plainly deleted. That allows to throw an error when encountering
a deletion that's actually a move, while attempting to following a
ctid chain.

The row deleted as part of a cross partition update is marked by
pointing it's t_ctid to an invalid block, instead of self as a normal
update would. That was deemed to be the least invasive and most
future proof way to represent the knowledge, given how few infomask
bits are there to be recycled (there's also some locking issues with
using infomask bits).

External code following ctid chains should be updated to check for
moved tuples. The most likely consequence of not doing so is a missed
error.

Author: Amul Sul, editorialized by me
Reviewed-By: Amit Kapila, Pavan Deolasee, Andres Freund, Robert Haas
Discussion: http://postgr.es/m/CAAJ_b95PkwojoYfz0bzXU8OokcTVGzN6vYGCNVUukeUDrnF3dw@mail.gmail.com

Indexes with INCLUDE columns and their support in B-tree

This patch introduces INCLUDE clause to index definition.  This clause
specifies a list of columns which will be included as a non-key part in
the index.  The INCLUDE columns exist solely to allow more queries to
benefit from index-only scans.  Also, such columns don't need to have
appropriate operator classes.  Expressions are not supported as INCLUDE
columns since they cannot be used in index-only scans.

Index access methods supporting INCLUDE are indicated by amcaninclude flag
in IndexAmRoutine.  For now, only B-tree indexes support INCLUDE clause.

In B-tree indexes INCLUDE columns are truncated from pivot index tuples
(tuples located in non-leaf pages and high keys).  Therefore, B-tree indexes
now might have variable number of attributes.  This patch also provides
generic facility to support that: pivot tuples contain number of their
attributes in t_tid.ip_posid.  Free 13th bit of t_info is used for indicating
that.  This facility will simplify further support of index suffix truncation.
The changes of above are backward-compatible, pg_upgrade doesn't need special
handling of B-tree indexes for that.

Bump catalog version

Author: Anastasia Lubennikova with contribition by Alexander Korotkov and me
Reviewed by: Peter Geoghegan, Tomas Vondra, Antonin Houska, Jeff Janes,
David Rowley, Alexander Korotkov
Discussion: https://www.postgresql.org/message-id/flat/56168952.4010101@postgrespro.ru

Make test of json(b)_to_tsvector language-independ

Missed in 1c1791e00065f6986f9d44a78ce7c28b2d1322dd commit

Add json(b)_to_tsvector function

Jsonb has a complex nature so there isn't best-for-everything way to convert it
to tsvector for full text search. Current to_tsvector(json(b)) suggests to
convert only string values, but it's possible to index keys, numerics and even
booleans value. To solve that json(b)_to_tsvector has a second required
argument contained a list of desired types of json fields. Second argument is
a jsonb scalar or array right now with possibility to add new options in a
future.

Bump catalog version

Author: Dmitry Dolgov with some editorization by me
Reviewed by: Teodor Sigaev
Discussion: https://www.postgresql.org/message-id/CA+q6zcXJQbS1b4kJ_HeAOoOc=unfnOrUEL=KGgE32QKDww7d8g@mail.gmail.com

Fix timing issue in new subscription truncate test

We need to wait for the initial sync of all subscriptions. On
some (faster?) machines, this didn't make a difference, but
the (slower?) buildfarm machines are upset.

Deactive flapping checksum isolation tests.

They've been broken for days, and prevent other tests from being
run. The plan is to revert their addition later.

Discussion: https://postgr.es/m/20180407162252.wfo5aorjrjw2n5ws@alap3.anarazel.de

Logical replication support for TRUNCATE

Update the built-in logical replication system to make use of the
previously added logical decoding for TRUNCATE support. Add the
required truncate callback to pgoutput and a new logical replication
protocol message.

Publications get a new attribute to determine whether to replicate
truncate actions. When updating a publication via pg_dump from an older
version, this is not set, thus preserving the previous behavior.

Author: Simon Riggs <simon@2ndquadrant.com>
Author: Marco Nenciarini <marco.nenciarini@2ndquadrant.it>
Author: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Reviewed-by: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>

Logical decoding of TRUNCATE

Add a new WAL record type for TRUNCATE, which is only used when
wal_level >= logical. (For physical replication, TRUNCATE is already
replicated via SMGR records.) Add new callback for logical decoding
output plugins to receive TRUNCATE actions.

Author: Simon Riggs <simon@2ndquadrant.com>
Author: Marco Nenciarini <marco.nenciarini@2ndquadrant.it>
Author: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Reviewed-by: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>

Predicate locking in hash indexes.

Hash index searches acquire predicate locks on the primary
page of a bucket. It acquires a lock on both the old and new buckets
for scans that happen concurrently with page splits. During a bucket
split, a predicate lock is copied from the primary page of an old
bucket to the primary page of a new bucket.

Author: Shubham Barai, Amit Kapila
Reviewed by: Amit Kapila, Alexander Korotkov, Thomas Munro
Discussion: https://www.postgresql.org/message-id/flat/CALxAEPvNsM2GTiXdRgaaZ1Pjd1bs+sxfFsf7Ytr+iq+5JJoYXA@mail.gmail.com

Document partprune.c a little better

Author: Amit Langote
Reviewed-by: Álvaro Herrera, David Rowley
Discussion: https://postgr.es/m/CA+HiwqGzq4D6z=8R0AP+XhbTFCQ-4Ct+t2ekqjE9Fpm84_JUGg@mail.gmail.com

Blindly attempt to fix sepgsql tests broken due to 9fdb675fc5.

The failure appears to solely be caused by the changed partition
pruning logic.

Author: Andres Freund
Discussion: https://postgr.es/m/20180406210330.wmqw42wqgiicktli@alap3.anarazel.de

Attempt to fix endianess issues in new hash partition test.

The tests added as part of 9fdb675fc5 yield differing results
depending on endianess, causing buildfarm failures. As the differences
are expected, split the hash partitioning tests into a different file
and maintain alternative output. The separate file is so the amount of
duplicated output is reduced.

David produced the alternative output without a machine to test on, so
it's possible this'll require a buildfarm cycle or two to get right.

Author: David Rowley
Discussion: https://postgr.es/m/CAKJS1f-6f4c2Qhuipe-GY7BKmFd0FMBobRnLS7hVCoAmTszsBg@mail.gmail.com

Fix and improve pg_atomic_flag fallback implementation.

The atomics fallback implementation for pg_atomic_flag was broken,
returning the inverted value from pg_atomic_test_set_flag().  This was
unnoticed because a) atomic flags were unused until recently b) the
test code wasn't run when the fallback implementation was in
use (because it didn't allow to test for some edge cases).

Fix the bug, and improve the fallback so it has the same behaviour as
the non-fallback implementation in the problematic edge cases. That
breaks ABI compatibility in the back branches when fallbacks are in
use, but given they were broken until now...

Author: Andres Freund
Reported-by: Daniel Gustafsson
Discussion:
    https://postgr.es/m/FB948276-7B32-4B77-83E6-D00167F8EEB4@yesql.se
    https://postgr.es/m/20180406233854.uni2h3mbnveczl32@alap3.anarazel.de
Backpatch: 9.5-, where the atomics abstraction was introduced.

Doc: fix broken markup.

Commit 3d956d956 was apparently not checked against HEAD's doc toolchain.
Per buildfarm.

Fix possible failure in parallel index build.

Report and proposed fix by David Rowley, put in patch form by
Peter Geoghegan.

Discussion: http://postgr.es/m/CAKJS1f91kq1wfYR8rnRRfKtxyhU2woEA+=whd640UxMyU+O0EQ@mail.gmail.com

Allow insert and update tuple routing and COPY for foreign tables.

Also enable this for postgres_fdw.

Etsuro Fujita, based on an earlier patch by Amit Langote. The larger
patch series of which this is a part has been reviewed by Amit
Langote, David Fetter, Maksim Milyutin, Álvaro Herrera, Stephen Frost,
and me. Minor documentation changes to the final version by me.

Discussion: http://postgr.es/m/29906a26-da12-8c86-4fb9-d8f88442f2b9@lab.ntt.co.jp

Remove some unnecessary quote marks from catalog DATA lines.

This has no functional impact whatsoever. However, it causes
these unnecessary quote marks to disappear from the generated
postgres.bki file, making it easier to verify that the upcoming
bootstrap data conversion patch doesn't change the generated file.

Fix badly edited doc sentence

Noted by Vik Fearing and Robert Haas

Clean up intermetiate state in pg_basebackup tests

These tests accummulated almost a gigabyte of data during the test which
was then removed at the end. Instead, remove output that's no longer
needed between the individual tests, to keep the total disk usage down
lower.

Author: Michael Banck

Fix typo

Author: Michael Banck

Faster partition pruning

Add a new module backend/partitioning/partprune.c, implementing a more
sophisticated algorithm for partition pruning. The new module uses each
partition's "boundinfo" for pruning instead of constraint exclusion,
based on an idea proposed by Robert Haas of a "pruning program": a list
of steps generated from the query quals which are run iteratively to
obtain a list of partitions that must be scanned in order to satisfy
those quals.

At present, this targets planner-time partition pruning, but there exist
further patches to apply partition pruning at execution time as well.

This commit also moves some definitions from include/catalog/partition.h
to a new file include/partitioning/partbounds.h, in an attempt to
rationalize partitioning related code.

Authors: Amit Langote, David Rowley, Dilip Kumar
Reviewers: Robert Haas, Kyotaro Horiguchi, Ashutosh Bapat, Jesper Pedersen.
Discussion: https://postgr.es/m/098b9c71-1915-1a2a-8d52-1a7a50ce79e8@lab.ntt.co.jp

Support new default roles with adminpack

This provides a newer version of adminpack which works with the newly
added default roles to support GRANT'ing to non-superusers access to
read and write files, along with related functions (unlinking files,
getting file length, renaming/removing files, scanning the log file
directory) which are supported through adminpack.

Note that new versions of the functions are required because an
environment might have an updated version of the library but still have
the old adminpack 1.0 catalog definitions (where EXECUTE is GRANT'd to
PUBLIC for the functions).

This patch also removes the long-deprecated alternative names for
functions that adminpack used to include and which are now included in
the backend, in adminpack v1.1. Applications using the deprecated names
should be updated to use the backend functions instead. Existing
installations which continue to use adminpack v1.0 should continue to
function until/unless adminpack is upgraded.

Reviewed-By: Michael Paquier
Discussion: https://postgr.es/m/20171231191939.GR2416%40tamriel.snowman.net

Add default roles for file/program access

This patch adds new default roles named 'pg_read_server_files',
'pg_write_server_files', 'pg_execute_server_program' which
allow an administrator to GRANT to a non-superuser role the ability to
access server-side files or run programs through PostgreSQL (as the user
the database is running as). Having one of these roles allows a
non-superuser to use server-side COPY to read, write, or with a program,
and to use file_fdw (if installed by a superuser and GRANT'd USAGE on
it) to read from files or run a program.

The existing misc file functions are also changed to allow a user with
the 'pg_read_server_files' default role to read any files on the
filesystem, matching the privileges given to that role through COPY and
file_fdw from above.

Reviewed-By: Michael Paquier
Discussion: https://postgr.es/m/20171231191939.GR2416%40tamriel.snowman.net

Remove explicit superuser checks in favor of ACLs

This removes the explicit superuser checks in the various file-access
functions in the backend, specifically pg_ls_dir(), pg_read_file(),
pg_read_binary_file(), and pg_stat_file(). Instead, EXECUTE is REVOKE'd
from public for these, meaning that only a superuser is able to run them
by default, but access to them can be GRANT'd to other roles.

Reviewed-By: Michael Paquier
Discussion: https://postgr.es/m/20171231191939.GR2416%40tamriel.snowman.net

Add memory context identifier to portal context

Discussion: https://www.postgresql.org/message-id/6421.1522194949@sss.pgh.pa.us

Rename MemoryContextCopySetIdentifier() for clarity

MemoryContextCopySetIdentifier -> MemoryContextCopyAndSetIdentifier

Discussion: https://www.postgresql.org/message-id/6421.1522194949@sss.pgh.pa.us

Enforce child constraints during COPY TO a partitioned table.

The previous coding inadvertently checked the constraints for the
partitioned table rather than the target partition, which could
lead to data in a partition that fails to satisfy some constraint
on that partition. This problem seems to date back to when
table partitioning was introduced; prior to that, there was only
one target table for a COPY, so the problem didn't occur, and the
code just didn't get updated.

Etsuro Fujita, reviewed by Amit Langote and Ashutosh Bapat

Discussion: https://postgr.es/message-id/5ABA4074.1090500%40lab.ntt.co.jp

Refactor PgFdwModifyState creation/destruction into separate functions.

Etsuro Fujita. The larger patch series of which this is a part has
been reviewed by Amit Langote, David Fetter, Maksim Milyutin,
Álvaro Herrera, Stephen Frost, and me.

Discussion: http://postgr.es/m/5A95487E.9050808@lab.ntt.co.jp

Split the SetSubscriptionRelState function into two

We don't actually need the insert-or-update logic, so it's clearer to
have separate functions for the inserting and updating.

Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>

Improve messaging during logical replication worker startup

In case the subscription is removed before the worker is fully started,
give a specific error message instead of the generic "cache lookup"
error.

Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>

Fix compiler warning about format truncation

Improve parse representation for MERGE

Separation of parser data structures from executor, as
requested by Tom Lane. Further improvements possible.

While there, implement error for multiple VALUES clauses via parser
to allow line number of error, as requested by Andres Freund.

Author: Pavan Deolasee

Discussion: https://www.postgresql.org/message-id/CABOikdPpqjectFchg0FyTOpsGXyPoqwgC==OLKWuxgBOsrDDZw@mail.gmail.com

Attempt to fix win32 build of pg_verify_checksums

S_ISLNK doesn't exist on Win32, instead we should use
pgwin32_is_junction().

Allow on-line enabling and disabling of data checksums

This makes it possible to turn checksums on in a live cluster, without
the previous need for dump/reload or logical replication (and to turn it
off).

Enabling checkusm starts a background process in the form of a
launcher/worker combination that goes through the entire database and
recalculates checksums on each and every page. Only when all pages have
been checksummed are they fully enabled in the cluster. Any failure of
the process will revert to checksums off and the process has to be
started.

This adds a new WAL record that indicates the state of checksums, so
the process works across replicated clusters.

Authors: Magnus Hagander and Daniel Gustafsson
Review: Tomas Vondra, Michael Banck, Heikki Linnakangas, Andrey Borodin

doc: remove mention of the DMOZ catalog in ltree docs

Discussion: https://postgr.es/m/CAF4Au4xYem_W3KOuxcKct7=G4j8Z3uO9j3DUKTFJqUsfp_9pQg@mail.gmail.com

Author: Oleg Bartunov

Backpatch-through: 9.3

MERGE syntax diagram correction

Reported-by: Andrew Gierth

PL/pgSQL: Add support for SET TRANSACTION

A normal SQL command run inside PL/pgSQL acquires a snapshot, but SET
TRANSACTION does not work anymore if a snapshot is set. So we have to
handle this separately.

Reviewed-by: Alexander Korotkov <a.korotkov@postgrespro.ru>
Reviewed-by: Tomas Vondra <tomas.vondra@2ndquadrant.com>

Allow cpluspluscheck to pass by renaming variable

Use of a C++ keyword as a function name caused problems

Reported-by: Álvaro Herrera

Fix plan cache issue in PL/pgSQL CALL

If we are not going to save the plan, then we need to unset expr->plan
after we are done, also in error cases.  Otherwise, we get a dangling
pointer next time around.

This is not the ideal solution.  It would be better if we could convince
SPI not to associate a cached plan with a resource owner, and then we
could just save the plan in all cases.  But that would require bigger
surgery.

Reported-by: Pavel Stehule <pavel.stehule@gmail.com>

Fix worker_spi for new parameter to initialize connection

Missed in previous commit.

Spotted by Teodor and the buildfarm

Remove tsearch test contained russian characters, missed in
1664ae1978bf0f5ee940dc2fc8313e6400a7e7da

Allow background workers to bypass datallowconn

THis adds a "flags" field to the BackgroundWorkerInitializeConnection()
and BackgroundWorkerInitializeConnectionByOid(). For now only one flag,
BGWORKER_BYPASS_ALLOWCONN, is defined, which allows the worker to ignore
datallowconn.

Add websearch_to_tsquery

Error-tolerant conversion function with web-like syntax for search query,
it simplifies constraining search engine with close to habitual interface for
users.

Bump catalog version

Authors: Victor Drobny, Dmitry Ivanov with editorization by me
Reviewed by: Aleksander Alekseev, Tomas Vondra, Thomas Munro, Aleksandr Parfenov
Discussion: https://www.postgresql.org/message-id/flat/fe931111ff7e9ad79196486ada79e268@postgrespro.ru

Add missing include

Newly added prototype broke cpluspluscheck.

Minor buglet in commit 8694cc96b52a.

Add support of bool, bpchar, name and uuid to btree_gin

Mostly for completeness, but I believe there are cases to use that in
multicolumn GIN indexes.

Bump btree_gin module version

Author: Matheus Oliveira
Reviewed by: Tomas Vondra
Discussion: https://www.postgresql.org/message-id/flat/CAJghg4LMJf6Z13fnZD-MBNiGxzd0cA2=F3TDjNkX3eQH58hktQ@mail.gmail.com

Fix handling of non-upgraded B-tree metapages

857f9c36 bumps B-tree metapage version while upgrade is performed "on the fly"
when needed. However, some asserts fired when old version metapage was
cached to rel->rd_amcache. Despite new metadata fields are never used from
rel->rd_amcache, that needs to be fixed. This patch introduces metadata
upgrade during its caching, which fills unavailable fields with their default
values. contrib/pageinspect is also patched to handle non-upgraded metapages
in the same way.

Author: Alexander Korotkov

MERGE minor errata

MERGE fix variable warning in non-assert builds

Author: Jesper Pedersen

MERGE INSERT allows only one VALUES clause

Doc syntax and brief mention of restriction

Remove unused vars and mark assert-only vars

Kyotaro HORIGUCHI

Fix misprint in documentation

Masahiko Sawada

Fix typo

Masahiko Sawada

MERGE post-commit review

Review comments from Andres Freund

* Consolidate code into AfterTriggerGetTransitionTable()
* Rename nodeMerge.c to execMerge.c
* Rename nodeMerge.h to execMerge.h
* Move MERGE handling in ExecInitModifyTable()
into a execMerge.c ExecInitMerge()
* Move mt_merge_subcommands flags into execMerge.h
* Rename opt_and_condition to opt_merge_when_and_condition
* Wordsmith various comments

Author: Pavan Deolasee
Reviewer: Simon Riggs

Install errcodes.txt for use by extensions.

Maintainers of out-of-tree PLs typically need access to the set of
error codes. To avoid the need to duplicate that information in some
form in PL source trees, provide errcodes.txt as part of a server
installation.

Thomas Munro, based on a suggestion from Andrew Gierth
Discussion: https://postgr.es/m/87woykk7mu.fsf%40news-spur.riddles.org.uk

doc: Improve indentation of SQL examples

Some of these were indented using 8 spaces whereas the rest uses 4
spaces. Probably originally some difference in tab size.

Restore erroneously removed ONLY from PK check

This is a blind fix, since I don't have SE-Linux to verify it.

Per unwanted change in rhinoceros, running sepgsql tests. Noted by Tom
Lane.

Discussion: https://postgr.es/m/32347.1522865050@sss.pgh.pa.us

Rewrite pg_dump TAP tests

This reworks how the tests to run are defined.  Instead of having to
define all runs for all tests, we define those tests which should pass
(generally using one of the defined broad hashes), add in any which
should be specific for this test, and exclude any specific runs that
shouldn't pass for this test.  This ends up removing some 4k+ lines
(more than half the file) but, more importantly, greatly simplifies the
way runs-to-be-tested are defined.

As discussed in the updated comments, for example, take the test which
does CREATE TABLE test_table.  That CREATE TABLE should show up in all
'full' runs of pg_dump, except those cases where 'test_table' is
excluded, of course, and that's exactly how the test gets defined now
(modulo a few other related cases, like where we dump only that table,
or we dump the schema it's in, or we exclude the schema it's in):

like => {
    %full_runs,
    %dump_test_schema_runs,
    only_dump_test_table    => 1,
    section_pre_data        => 1, },
unlike => {
    exclude_dump_test_schema => 1,
    exclude_test_table => 1, }, },

Next, we no longer expect every run to be listed for every test.  If a
run is listed in 'like' (directly or through a hash) then it's a 'like',
unless it's listed in 'unlike' in which case it's an 'unlike'.  If it
isn't listed in either, then it's considered an 'unlike' automatically.

Lastly, this changes the code to no longer use like/unlike but rather to
use 'ok()' with 'diag()' which allows much more control over what gets
spit out to the screen.  Gone are the days of the entire dump being sent
to the console, now you'll just get a couple of lines for each failing
test which say the test that failed and the run that it failed on.

This covers both the pg_dump TAP tests in src/bin/pg_dump and those in
src/test/modules/test_pg_dump.

docs: update ltree URL for the DMOZ catalog

Reported-by: bbrincat@gmail.com
Discussion: https://postgr.es/m/152283596377.1441.11672249301622760943@wrigleys.postgresql.org

Author: Oleg Bartunov

Backpatch-through: 9.3

Improve FSM management for BRIN indexes.

BRIN indexes like to propagate additions of free space into the upper pages
of their free space maps as soon as the new space is known, even when it's
just on one individual index page.  Previously this required calling
FreeSpaceMapVacuum, which is quite an expensive thing if the map is large.
Use the FreeSpaceMapVacuumRange function recently added by commit c79f6df75
to reduce the amount of work done for this purpose.

Fix a couple of places that neglected to do the upper-page vacuuming at all
after recording new free space.  If the policy is to be that BRIN should do
that, it should do it everywhere.

Do RecordPageWithFreeSpace unconditionally in brin_page_cleanup, and do
FreeSpaceMapVacuum unconditionally in brin_vacuum_scan.  Because of the
FSM's imprecise storage of free space, the old complications here seldom
bought anything, they just slowed things down.  This approach also
provides a predictable path for FSM corruption to be repaired.

Remove premature RecordPageWithFreeSpace call in brin_getinsertbuffer
where it's about to return an extended page to the caller.  The caller
should do that, instead, after it's inserted its new tuple.  Fix the
one caller that forgot to do so.

Simplify logic in brin_doupdate's same-page-update case by postponing
brin_initialize_empty_new_buffer to after the critical section; I see
little point in doing it before.

Avoid repeat calls of RelationGetNumberOfBlocks in brin_vacuum_scan.
Avoid duplicate BufferGetBlockNumber and BufferGetPage calls in
a couple of places where we already had the right values.

Move a BRIN_elog debug logging call out of a critical section; that's
pretty unsafe and I don't think it buys us anything to not wait till
after the critical section.

Move the "*extended = false" step in brin_getinsertbuffer into the
routine's main loop.  There's no actual bug there, since the loop can't
iterate with *extended still true, but it doesn't seem very future-proof
as coded; and it's certainly not documented as a loop invariant.

This is all from follow-on investigation inspired by commit c79f6df75.

Discussion: https://postgr.es/m/5801.1522429460@sss.pgh.pa.us

Foreign keys on partitioned tables

Author: Álvaro Herrera
Discussion: https://postgr.es/m/20171231194359.cvojcour423ulha4@alvherre.pgsql
Reviewed-by: Peter Eisentraut

Skip full index scan during cleanup of B-tree indexes when possible

Vacuum of index consists from two stages: multiple (zero of more) ambulkdelete
calls and one amvacuumcleanup call. When workload on particular table
is append-only, then autovacuum isn't intended to touch this table. However,
user may run vacuum manually in order to fill visibility map and get benefits
of index-only scans. Then ambulkdelete wouldn't be called for indexes
of such table (because no heap tuples were deleted), only amvacuumcleanup would
be called In this case, amvacuumcleanup would perform full index scan for
two objectives: put recyclable pages into free space map and update index
statistics.

This patch allows btvacuumclanup to skip full index scan when two conditions
are satisfied: no pages are going to be put into free space map and index
statistics isn't stalled. In order to check first condition, we store
oldest btpo_xact in the meta-page. When it's precedes RecentGlobalXmin, then
there are some recyclable pages. In order to check second condition we store
number of heap tuples observed during previous full index scan by cleanup.
If fraction of newly inserted tuples is less than
vacuum_cleanup_index_scale_factor, then statistics isn't considered to be
stalled. vacuum_cleanup_index_scale_factor can be defined as both reloption and GUC (default).

This patch bumps B-tree meta-page version. Upgrade of meta-page is performed
"on the fly": during VACUUM meta-page is rewritten with new version. No special
handling in pg_upgrade is required.

Author: Masahiko Sawada, Alexander Korotkov
Review by: Peter Geoghegan, Kyotaro Horiguchi, Alexander Korotkov, Yura Sokolov
Discussion: https://www.postgresql.org/message-id/flat/CAD21AoAX+d2oD_nrd9O2YkpzHaFr=uQeGr9s1rKC3O4ENc568g@mail.gmail.com

Remove less-portable-than-believed test case.

In commit 331b2369c I added a test to see what jsonb_plperl would do
with a qr{} result. Turns out the answer is Perl version dependent.
That fact doesn't bother me particularly, but coping with multiple
result possibilities is way more work than this test seems worth.
So remove it again.

Discussion: https://postgr.es/m/E1f3MMJ-0006bf-B0@gemulon.postgresql.org

Fix platform and Perl-version dependencies in new jsonb_plperl code.

Testing SvTYPE() directly is more fraught with problems than one might
think, because depending on context Perl might be storing a scalar value
in one of several forms, eg both numeric and string values.  This resulted
in Perl-version-dependent buildfarm test failures.  Instead use the SvTYPE
test only to distinguish non-scalar cases (AV, HV, NULL).  Disambiguate
scalars by testing SvIOK, SvNOK, then SvPOK.  This creates a preference
order for how we will resolve cases where the value is available in more
than one form, which seems fine to me.

Furthermore, because we're now dealing directly with a "double" value
in the SvNOK case, we can get rid of an inadequate and unportable
string-comparison test for infinities, and use isinf() instead.
(We do need some additional #include and "-lm" infrastructure to use
that in a contrib module, per prior experiences.)

In passing, prevent the regression test results from depending on DROP
CASCADE order; I've not seen that malfunction, but it's trouble waiting
to happen.

Discussion: https://postgr.es/m/E1f3MMJ-0006bf-B0@gemulon.postgresql.org