granicus.if.org Git - postgresql/log

Make CppAsString2() more visible in c.h.

For some reason this standard C string-processing hack was buried in an
NLS-related section of c.h. Put it beside CppAsString() so that people
are more likely to find it and not be tempted to reinvent local copies,
as I nearly did. And provide a more helpful comment, too.

Throw an error if a DATA() line contains wrong # of attributes.

David Christensen, reviewed by Dagfinn Ilmari Mannsåker

Discussion: http://postgr.es/m/20170215154018.fs5vwtqhp5d2sifs@veeddeux.attlocal.net

Use group updates when setting transaction status in clog.

Commit 0e141c0fbb211bdd23783afa731e3eef95c9ad7a introduced a mechanism
to reduce contention on ProcArrayLock by having a single process clear
XIDs in the procArray on behalf of multiple processes, reducing the
need to hand the lock around.  Use a similar mechanism to reduce
contention on CLogControlLock.  Testing shows that this very
significantly reduces the amount of time waiting for CLogControlLock
on high-concurrency pgbench tests run on a large multi-socket
machines; whether that translates into a TPS improvement depends on
how much of that contention is simply shifted to some other lock,
particularly WALWriteLock.

Amit Kapila, with some cosmetic changes by me.  Extensively reviewed,
tested, and benchmarked over a period of about 15 months by Simon
Riggs, Robert Haas, Andres Freund, Jesper Pedersen, and especially by
Tomas Vondra and Dilip Kumar.

Discussion: http://postgr.es/m/CAA4eK1L_snxM_JcrzEstNq9P66++F4kKFce=1r5+D1vzPofdtg@mail.gmail.com
Discussion: http://postgr.es/m/CAA4eK1LyR2A+m=RBSZ6rcPEwJ=rVi1ADPSndXHZdjn56yqO6Vg@mail.gmail.com
Discussion: http://postgr.es/m/91d57161-d3ea-0cc2-6066-80713e4f90d7@2ndquadrant.com

Fix timestamptz regression test to still work with latest IANA zone data.

The IANA timezone crew continues to chip away at their project of removing
timezone abbreviations that have no real-world currency from their
database.  The tzdata2017a update removes all such abbreviations for
South American zones, as well as much of the Pacific.  This breaks some
test cases in timestamptz.sql that were expecting America/Santiago and
America/Caracas to have non-numeric abbreviations.

The test cases involving America/Santiago seem to have selected that
zone more or less at random, so just replace it with America/New_York,
which is of similar longitude.  The cases involving America/Caracas are
harder since they were chosen to test a time-varying zone abbreviation
around a point where it changed meaning in the backwards direction.
Fortunately, Europe/Moscow has a similar case in 2014, and the MSK/MSD
abbreviations are well enough attested that IANA seems unlikely to
decide to remove them from the database in future.

With these changes, this regression test should pass when using any IANA
zone database from 2015 or later.  One could wish that there were a few
years more daylight on how out-of-date your zone database can be ... but
really the --with-system-tzdata option is only meant for use on platforms
where the zone database is kept up-to-date pretty faithfully, so I do not
think this is a big objection.

Discussion: https://postgr.es/m/6749.1489087470@sss.pgh.pa.us

Add relkind checks to certain contrib modules

The contrib extensions pageinspect, pg_visibility and pgstattuple only
work against regular relations which have storage. They don't work
against foreign tables, partitioned (parent) tables, views, et al.

Add checks to the user-callable functions to return a useful error
message to the user if they mistakenly pass an invalid relation to a
function which doesn't accept that kind of relation.

In passing, improve some of the existing checks to use ereport() instead
of elog(), add a function to consolidate common checks where
appropriate, and add some regression tests.

Author: Amit Langote, with various changes by me
Reviewed by: Michael Paquier and Corey Huinker
Discussion: https://postgr.es/m/ab91fd9d-4751-ee77-c87b-4dd704c1e59c@lab.ntt.co.jp

Document lack of validation when attaching foreign partitions.

Ashutosh Bapat, revised a bit by me.

Discussion: http://postgr.es/m/CAFjFpRdLaCa-1wJase0=YWG5o3cJnbuUt_vrqm2TDBKM_vQ_oA@mail.gmail.com

Document some new parallel query capabilities.

This updates the text for parallel index scan, parallel index-only
scan, parallel bitmap heap scan, and parallel merge join. It also
expands the discussion of parallel joins slightly.

Discussion: http://postgr.es/m/CA+TgmoZnCUoM31w3w7JSakVQJQOtcuTyX=HLUr-X1rto2=2bjw@mail.gmail.com

Fix bug in parallel tidbitmap iteration.

Avoid computing idxpages[istate->spageptr] until after checking
that istate->spageptr is a legal index.

Dilip Kumar, per a report from David Rowley

Discussion: http://postgr.es/m/CAKJS1f8OtrHE+-P+=E=4ycnL29e9idZKuaTQ6o2MbhvGN9D8ig@mail.gmail.com

Fix a couple of planner bugs in Gather Merge.

Neha Sharma reported these to Rushabh Lathia just after I commit
355d3993c53ed62c5b53d020648e4fbcfbf5f155 went in. The patch is
Rushabh's, with input from me.

Use SQL standard error code for nextval

Enable replication connections by default in pg_hba.conf

initdb now initializes a pg_hba.conf that allows replication connections
from the local host, same as it does for regular connections. The
connecting user still needs to have the REPLICATION attribute or be a
superuser.

The intent is to allow pg_basebackup from the local host to succeed
without requiring additional configuration.

Michael Paquier <michael.paquier@gmail.com> and me

Add a Gather Merge executor node.

Like Gather, we spawn multiple workers and run the same plan in each
one; however, Gather Merge is used when each worker produces the same
output ordering and we want to preserve that output ordering while
merging together the streams of tuples from various workers.  (In a
way, Gather Merge is like a hybrid of Gather and MergeAppend.)

This works out to a win if it saves us from having to perform an
expensive Sort.  In cases where only a small amount of data would need
to be sorted, it may actually be faster to use a regular Gather node
and then sort the results afterward, because Gather Merge sometimes
needs to wait synchronously for tuples whereas a pure Gather generally
doesn't.  But if this avoids an expensive sort then it's a win.

Rushabh Lathia, reviewed and tested by Amit Kapila, Thomas Munro,
and Neha Sharma, and reviewed and revised by me.

Discussion: http://postgr.es/m/CAGPqQf09oPX-cQRpBKS0Gq49Z+m6KBxgxd_p9gX8CKk_d75HoQ@mail.gmail.com

Fix inclusions of c.h from .h files.

We have a project policy that every .c file should start by including
postgres.h, postgres_fe.h, or c.h as appropriate; and then there is no
need for any .h file to explicitly include any of these. Fix a few
headers that were violating this policy by including c.h.

Discussion: https://postgr.es/m/CAEepm=2zCoeq3QxVwhS5DFeUh=yU6z81pbWMgfOB8OzyiBwxzw@mail.gmail.com
Discussion: https://postgr.es/m/11634.1488932128@sss.pgh.pa.us

Fix inclusions of postgres_fe.h from .h files.

We have a project policy that every .c file should start by including
postgres.h, postgres_fe.h, or c.h as appropriate; and then there is no
need for any .h file to explicitly include any of these. Fix a few
headers that were violating this policy by including postgres_fe.h.

Discussion: https://postgr.es/m/CAEepm=2zCoeq3QxVwhS5DFeUh=yU6z81pbWMgfOB8OzyiBwxzw@mail.gmail.com
Discussion: https://postgr.es/m/11634.1488932128@sss.pgh.pa.us

Bring plpgsql into line with header inclusion policy.

We have a project policy that every .c file should start by including
postgres.h, postgres_fe.h, or c.h as appropriate; and then there is no
need for any .h file to explicitly include any of these.  (The core
reason for this policy is to make it easy to verify that pg_config_os.h
is included before any system headers such as <stdio.h>; without that,
we have portability issues on some platforms due to variation in largefile
options across different modules in the backend.  Also, if .h files were
responsible for choosing which of these key headers to include, .h files
that need to be includable in either frontend or backend compiles would be
in trouble.)

plpgsql was blithely ignoring this policy, so whack it upside the head
until it complies.  I also chose to standardize on including plpgsql's
own .h files after all core-system headers that it pulls in.  That
could've been done either way, but this way seems saner.

Discussion: https://postgr.es/m/CAEepm=2zCoeq3QxVwhS5DFeUh=yU6z81pbWMgfOB8OzyiBwxzw@mail.gmail.com
Discussion: https://postgr.es/m/11634.1488932128@sss.pgh.pa.us

Document intentional violations of header inclusion policy.

Although there are good reasons for our policy of including postgres.h
as the first #include in every .c file, never from .h files, there are
two places where it seems expedient to violate the policy because the
alternative is to modify externally-supplied .c files. (In the case
of the regexp library, the idea that it's externally-supplied is kind
of at odds with reality, but I haven't entirely given up hope that it
will become a standalone project some day.) Add some comments to make
it explicit that this is a policy violation and provide the reasoning.

In passing, move #include "miscadmin.h" out of regcomp.c and into
regcustom.h, which is where it should be if we're taking this reasoning
seriously at all.

Discussion: https://postgr.es/m/CAEepm=2zCoeq3QxVwhS5DFeUh=yU6z81pbWMgfOB8OzyiBwxzw@mail.gmail.com
Discussion: https://postgr.es/m/11634.1488932128@sss.pgh.pa.us

Suppress compiler warning in slab.c.

Compilers that don't realize that elog(ERROR) doesn't return
complained that SlabRealloc() failed to return a value.

While at it, fix the rather muddled header comment for the function.

Per buildfarm.

Suppress compiler warning in non-USE_LIBXML builds.

Compilers that don't realize that ereport(ERROR) doesn't return
complained that XmlTableGetValue() failed to return a value.

Also, make XmlTableFetchRow's non-USE_LIBXML case look more like
the other ones. As coded, it could lead to "unreachable code"
warnings with USE_LIBXML enabled.

Oversights in commit fcec6caaf. Per buildfarm.

Put back <float.h> in a few files that need it for _isnan().

Further fallout from commit c29aff959: there are some files that need
<float.h>, and were getting it from datatype/timestamp.h, but it was not
apparent in my (tgl's) testing because the requirement for <float.h>
exists only on certain Windows toolchains.

Report and patch by David Rowley.

Discussion: https://postgr.es/m/CAKJS1f-BHceaFzZScFapDV48gUVM2CAOBfhkgffdqXzFb+kwew@mail.gmail.com

Expose explain's SUMMARY option

This exposes the existing explain summary option to users to allow them
to choose if they wish to have the planning time and totalled run time
included in the EXPLAIN result. The existing default behavior is
retained if SUMMARY is not specified- running explain without analyze
will not print the summary lines (just the planning time, currently)
while running explain with analyze will include the summary lines (both
the planning time and the totalled execution time).

Users who wish to see the summary information for plain explain can now
use: EXPLAIN (SUMMARY ON) query; Users who do not want to have the
summary printed for an analyze run can use:
EXPLAIN (ANALYZE ON, SUMMARY OFF) query;

With this, we can now also have EXPLAIN ANALYZE queries included in our
regression tests by using:
EXPLAIN (ANALYZE ON, TIMING OFF, SUMMARY off) query;

I went ahead and added an example of this, which will hopefully not make
the buildfarm complain.

Author: Ashutosh Bapat
Discussion: https://postgr.es/m/CAFjFpReE5z2h98U2Vuia8hcEkpRRwrauRjHmyE44hNv8-xk+XA@mail.gmail.com

Silence compiler warnings in BitmapHeapNext().

Same disease as 270d7dd8a5a7128fc2b859f3bf95e2c1fb45be79.

Use doubly-linked block lists in aset.c to reduce large-chunk overhead.

Large chunks (those too large for any palloc freelist) are managed as
separate blocks.  Formerly, realloc'ing or pfree'ing such a chunk required
O(N) time in a context with N blocks, since we had to traipse down the
singly-linked block list to locate the block's predecessor before we could
fix the list links.  This can result in O(N^2) runtime in situations where
large numbers of such chunks are manipulated within one context.  Cases
like that were not foreseen in the original design of aset.c, and indeed
didn't arise until fairly recently.  But such problems can now occur in
reorderbuffer.c and in hash joining, both of which make repeated large
requests without scaling up their request size as they do so, and which
will free their requests in not-necessarily-LIFO order.

To fix, change the block list from singly-linked to doubly-linked.
This adds another 4 or 8 bytes to ALLOC_BLOCKHDRSZ, but that doesn't
seem like unacceptable overhead, since aset.c's blocks are normally
8K or more, and never less than 1K in current practice.

In passing, get rid of some redundant AllocChunkGetPointer() calls in
AllocSetRealloc (the compiler might be smart enough to optimize these
away anyway, but no need to assume that) and improve AllocSetCheck's
checking of block header fields.

Back-patch to 9.4 where reorderbuffer.c appeared.  We could take this
further back, but currently there's no evidence that it would be useful.

Discussion: https://postgr.es/m/CAMkU=1x1hvue1XYrZoWk_omG0Ja5nBvTdvgrOeVkkeqs71CV8g@mail.gmail.com

Support parallel bitmap heap scans.

The index is scanned by a single process, but then all cooperating
processes can iterate jointly over the resulting set of heap blocks.
In the future, we might also want to support using a parallel bitmap
index scan to set up for a parallel bitmap heap scan, but that's a
job for another day.

Dilip Kumar, with some corrections and cosmetic changes by me. The
larger patch set of which this is a part has been reviewed and tested
by (at least) Andres Freund, Amit Khandekar, Tushar Ahuja, Rafia
Sabih, Haribabu Kommi, Thomas Munro, and me.

Discussion: http://postgr.es/m/CAFiTN-uc4=0WxRGfCzs-xfkMYcSEWUC-Fon6thkJGjkh9i=13A@mail.gmail.com

Prevent logical rep workers with removed subscriptions from starting.

Any logical rep workers must have their subscription entries in
pg_subscription. To ensure this, we need to prevent the launcher
from starting new worker corresponding to the subscription that
DROP SUBSCRIPTION command is removing. To implement this,
previously LogicalRepLauncherLock was introduced and held until
the end of transaction running DROP SUBSCRIPTION. But using
LWLock for that purpose was not valid.

Instead, this commit changes DROP SUBSCRIPTION so that it takes
AccessExclusiveLock on pg_subscription, in order to ensure that
the launcher cannot see any subscriptions being removed. Also this
commit gets rid of LogicalRepLauncherLock.

Patch by me, reviewed by Petr Jelinek

Discussion: https://www.postgresql.org/message-id/CAHGQGwHPi8ky-yANFfe0sgmhKtsYcQLTnKx07bW9S7-Rn1746w@mail.gmail.com

Fix XMLTABLE on older libxml2

libxml2 older than 2.9.1 does not have xmlXPathSetContextNode (released
in 2013, so reasonable platforms have trouble). That function is fairly
trivial, so I have inlined it in the one added caller. This passes
tests on my machine; let's see what the buildfarm thinks about it.

Per joint complaint from Tom Lane and buildfarm.

Add tests for foreign partitions.

Amit Langote, reviewed by Ashutosh Bapat

Discussion: http://postgr.es/m/475dd52c-be4a-9b32-6d54-3044a00c93d9@lab.ntt.co.jp

Support XMLTABLE query expression

XMLTABLE is defined by the SQL/XML standard as a feature that allows
turning XML-formatted data into relational form, so that it can be used
as a <table primary> in the FROM clause of a query.

This new construct provides significant simplicity and performance
benefit for XML data processing; what in a client-side custom
implementation was reported to take 20 minutes can be executed in 400ms
using XMLTABLE.  (The same functionality was said to take 10 seconds
using nested PostgreSQL XPath function calls, and 5 seconds using
XMLReader under PL/Python).

The implemented syntax deviates slightly from what the standard
requires.  First, the standard indicates that the PASSING clause is
optional and that multiple XML input documents may be given to it; we
make it mandatory and accept a single document only.  Second, we don't
currently support a default namespace to be specified.

This implementation relies on a new executor node based on a hardcoded
method table.  (Because the grammar is fixed, there is no extensibility
in the current approach; further constructs can be implemented on top of
this such as JSON_TABLE, but they require changes to core code.)

Author: Pavel Stehule, Álvaro Herrera
Extensively reviewed by: Craig Ringer
Discussion: https://postgr.es/m/CAFj8pRAgfzMD-LoSmnMGybD0WsEznLHWap8DO79+-GTRAPR4qA@mail.gmail.com

Silence compiler warnings in tbm_prepare_shared_iterate().

Maybe Robert's compiler can convince itself that these variables are
never used uninitialized, but mine can't.

pg_waldump: Remove extra newline in error message

fatal_error() already prints out a trailing newline.

Fix connection leak in DROP SUBSCRIPTION command, take 2.

Commit 898a792eb8283e31efc0b6fcbc03bbcd5f7df667 fixed the connection
leak issue, but it was an unreliable way of bugfix. This bugfix was
assuming that walrcv_command() subroutine cannot throw an error,
but it's untenable assumption. For example, if it will be changed
so that an error is thrown, connection leak issue will happen again.

This patch ensures that the connection is closed even when
walrcv_command() subroutine throws an error.

Patch by me, reviewed by Petr Jelinek and Michael Paquier

Discussion: https://www.postgresql.org/message-id/2058.1487704345@sss.pgh.pa.us

Create INSTALL file via XSLT

As before, create an INSTALL.html file for processing with lynx, but use
xsltproc and a new XSLT stylesheet instead of jade and DSSSL.

Replacing jade with xsltproc removes jade from the requirements for
distribution building.

Reviewed-by: Magnus Hagander <magnus@hagander.net>

Remove inclusion of postgres.h from a few header files.

Thomas Munro, per project policy articuled by Andres Freund and
Tom Lane.

Discussion: http://postgr.es/m/CAEepm=2zCoeq3QxVwhS5DFeUh=yU6z81pbWMgfOB8OzyiBwxzw@mail.gmail.com

Fix parallel index and index-only scans to fall back to serial.

Parallel executor nodes can't assume that parallel execution will
happen in every case where the plan calls for it, because it might
not work out that way. However, parallel index scan and parallel
index-only scan failed to do the right thing here. Repair.

Amit Kapila, per a report from me.

Discussion: http://postgr.es/m/CAA4eK1Kq5qb_u2AOoda5XBB91vVWz90w=LgtRLgsssriS8pVTw@mail.gmail.com

tidbitmap: Support shared iteration.

When a shared iterator is used, each call to tbm_shared_iterate()
returns a result that has not yet been returned to any process
attached to the shared iterator. In other words, each cooperating
processes gets a disjoint subset of the full result set, but all
results are returned exactly once.

This is infrastructure for parallel bitmap heap scan.

Dilip Kumar. The larger patch set of which this is a part has been
reviewed and tested by (at least) Andres Freund, Amit Khandekar,
Tushar Ahuja, Rafia Sabih, Haribabu Kommi, and Thomas Munro.

Discussion: http://postgr.es/m/CAFiTN-uc4=0WxRGfCzs-xfkMYcSEWUC-Fon6thkJGjkh9i=13A@mail.gmail.com

Fix grammar

Reported by Jeremy Finzel

Fix segfault in ALTER PUBLICATION/SUBSCRIPTION RENAME

From: Masahiko Sawada <sawada.mshk@gmail.com>
Reported-by: Fujii Masao <masao.fujii@gmail.com>

hash: Refactor hash index creation.

The primary goal here is to move all of the related page modifications
to a single section of code, in preparation for adding write-ahead
logging. In passing, rename _hash_metapinit to _hash_init, since it
initializes more than just the metapage.

Amit Kapila. The larger patch series of which this is a part has been
reviewed and tested by Álvaro Herrera, Ashutosh Sharma, Mark Kirkwood,
Jeff Janes, and Jesper Pedersen.

Improve postgresql.conf.sample comments about parallel workers.

David Rowley, reviewed by Amit Kapila

Discussion: http://postgr.es/m/CAKJS1f8gPEUPscj6kSqpveMnnx9_3ZypzwsKstv+8atx6VmjBg@mail.gmail.com

Properly initialize variable.

Commit 3bc7dafa9bebbdaa1bbf0da0798d29a8bdaf6a8f forgot to do this.

Noted while experimenting with valgrind.

Invent start_proc parameters for PL/Tcl.

Define GUCs pltcl.start_proc and pltclu.start_proc.  When set to a
nonempty value at the time a new Tcl interpreter is created, the
parameterless pltcl or pltclu function named by the GUC is called to
allow user-controlled initialization to occur within the interpreter.
This is modeled on plv8's start_proc parameter, and also has much in
common with plperl's on_init feature.  It allows users to fully
replace the "modules" feature that was removed in commit 817f2a586.

Since an initializer function could subvert later Tcl code in nearly
arbitrary ways, mark both GUCs as SUSET for now.  It would be nice
to find a way to relax that someday; but the corresponding GUCs in
plperl are also SUSET, and there's not been much complaint.

Discussion: https://postgr.es/m/22067.1488046447@sss.pgh.pa.us

Clean up test_ifaddrs a bit.

We customarily #include <netinet/in.h> before <arpa/inet.h>; according
to our git history (cf commit 527f8babc) there used to be platform(s)
where <arpa/inet.h> didn't compile otherwise. That's probably not
really an issue anymore, but since test_ifaddrs.c is the one and only
place in our code that's not following that rule, bring it into line.
Also remove #include <sys/socket.h>, as that's duplicative given that
libpq/ifaddr.h does so (via pqcomm.h).

In passing, add a .gitignore file so nobody accidentally commits the
test_ifaddrs executable, as I nearly did.

I see no particular need to back-patch this, as it's just neatnik-ism
considering we don't build test_ifaddrs by default, or even document
it anywhere.

A collection of small fixes for the SCRAM patch.

* Add required #includes for htonl. Per buildfarm members pademelon/gaur.

* Remove unnecessary "#include <utils/memutils>".

* Fix checking for empty string in pg_SASL_init. (Reported by Peter
  Eisentraut and his compiler)

* Move code in pg_SASL_init to match the recent changes (commit ba005f193d)
  to pg_fe_sendauth() function, where it's copied from.

* Return value of malloc() was not checked for NULL in
  scram_SaltedPassword(). Fix by avoiding the malloc().

Consider parallel merge joins.

Commit 45be99f8cd5d606086e0a458c9c72910ba8a613d took the position
that performing a merge join in parallel was not likely to work out
well, but this conclusion was greeted with skepticism even at the
time. Whether it was true then or not, it's clearly not true any
more now that we have parallel index scan.

Dilip Kumar, reviewed by Amit Kapila and by me.

Discussion: http://postgr.es/m/CAFiTN-v3=cM6nyFwFGp0fmvY4=kk79Hq9Fgu0u8CSJ-EEq1Tiw@mail.gmail.com

Fix pgbench's failure to honor the documented long-form option "--builtin".

Not only did it not accept --builtin as a synonym for -b, but what it did
accept as a synonym was --tpc-b (huh?), which it got even further wrong
by marking as no_argument, so that if you did try that you got a core
dump. I suppose this is leftover from some early design for the new
switches added by commit 8bea3d221, but it's still pretty sloppy work.

Per bug #14580 from Stepan Pesternikov. Back-patch to 9.6 where the
error was introduced.

Report: https://postgr.es/m/20170307123347.25054.73207@wrigleys.postgresql.org

Give partitioned table "p" in regression tests a less generic name.

And don't drop it, so that we improve the coverage of the pg_upgrade
regression tests.

Amit Langote, per a gripe from Tom Lane

Discussion: http://postgr.es/m/9071.1488863082@sss.pgh.pa.us

Fix relcache reference leak.

Reported by Kevin Grittner. Faulty commit identified by Tom Lane.
Patch by Amit Langote, reviewed by Michael Paquier.

Discussion: http://postgr.es/m/CACjxUsOHbH1=99u8mGxmLHfy5hov4ENEpvM6=3ARjos7wG7rtQ@mail.gmail.com

Remove duplicated word.

Amit Langote

Document what values postgres_fdw sets for each parameter it sets.

David Rader, reviewed by me.

Fix wrong word in comment.

Third time's the charm.

Remove vestigial grammar support for CHARACTER ... CHARACTER SET option.

The SQL standard says that you should be able to write "CHARACTER SET foo"
as part of the declaration of a char-type column. We don't implement that,
but a rough form of support has existed in gram.y since commit f10b63923.
That's now sat there for nigh 20 years without anyone fleshing it out ---
and even if someone did, the contemplated approach of having separate data
type name(s) for every character set certainly isn't what we'd do today.
Let's just remove the grammar production; if anyone is ever motivated to
work on this, reinventing the grammar support is a trivial fraction of
what they'd have to do. And we've never documented anything about
supporting such a clause.

Per gripe from Neha Khatri.

Discussion: https://postgr.es/m/CAFO0U+-iOS5oYN5v3SBuZvfhPUTRrkDFEx8w7H17B07Rwg3YUA@mail.gmail.com

Preparatory refactoring for parallel merge join support.

Extract the logic used by hash_inner_and_outer into a separate
function, get_cheapest_parallel_safe_total_inner, so that it can
also be used to plan parallel merge joins.

Also, add a require_parallel_safe argument to the existing function
get_cheapest_path_for_pathkeys, because parallel merge join needs
to find the cheapest path for a given set of pathkeys that is
parallel-safe, not just the cheapest one overall.

Patch by me, reviewed by Dilip Kumar.

Discussion: http://postgr.es/m/CA+TgmoYOv+dFK0MWW6366dFj_xTnohQfoBDrHyB7d1oZhrgPjA@mail.gmail.com

Fix parallel hash join path search.

When the very cheapest path is not parallel-safe, we want to instead use
the cheapest unparameterized path that is. The old code searched
innerrel->cheapest_parameterized_paths, but that isn't right, because
the path we want may not be in that list. Search innerrel->pathlist
instead.

Spotted by Dilip Kumar.

Discussion: http://postgr.es/m/CAFiTN-szCEcZrQm0i_w4xqSaRUTOUFstNu32Zn4rxxDcoa8gnA@mail.gmail.com

psql: Add \gx command

It can often be useful to use expanded mode output (\x) for just a
single query. Introduce a \gx which acts exactly like \g except that it
will force expanded output mode for that one \gx call. This is simpler
than having to use \x as a toggle and also means that the user doesn't
have to worry about the current state of the expanded variable, or
resetting it later, to ensure a given query is always returned in
expanded mode.

Primairly Christoph's patch, though I did tweak the documentation and help
text a bit, and re-indented the tab completion section.

Author: Christoph Berg
Reviewed By: Daniel Verite
Discussion: https://postgr.es/m/20170127132737.6skslelaf4txs6iw%40msg.credativ.de

Allow pg_dumpall to dump roles w/o user passwords

Add new option --no-role-passwords which dumps roles without passwords.
Since we don’t need passwords, we choose to use pg_roles in preference
to pg_authid since access may be restricted for security reasons in
some configrations.

Robins Tharakan and Simon Riggs

Fix comments in SCRAM-SHA-256 patch.

Amit Kapila.

Ensure ThisTimeLineID is valid before START_REPLICATION

Craig Ringer

Add regression tests for passwords.

Michael Paquier.

Support SCRAM-SHA-256 authentication (RFC 5802 and 7677).

This introduces a new generic SASL authentication method, similar to the
GSS and SSPI methods. The server first tells the client which SASL
authentication mechanism to use, and then the mechanism-specific SASL
messages are exchanged in AuthenticationSASLcontinue and PasswordMessage
messages. Only SCRAM-SHA-256 is supported at the moment, but this allows
adding more SASL mechanisms in the future, without changing the overall
protocol.

Support for channel binding, aka SCRAM-SHA-256-PLUS is left for later.

The SASLPrep algorithm, for pre-processing the password, is not yet
implemented. That could cause trouble, if you use a password with
non-ASCII characters, and a client library that does implement SASLprep.
That will hopefully be added later.

Authorization identities, as specified in the SCRAM-SHA-256 specification,
are ignored. SET SESSION AUTHORIZATION provides more or less the same
functionality, anyway.

If a user doesn't exist, perform a "mock" authentication, by constructing
an authentic-looking challenge on the fly. The challenge is derived from
a new system-wide random value, "mock authentication nonce", which is
created at initdb, and stored in the control file. We go through these
motions, in order to not give away the information on whether the user
exists, to unauthenticated users.

Bumps PG_CONTROL_VERSION, because of the new field in control file.

Patch by Michael Paquier and Heikki Linnakangas, reviewed at different
stages by Robert Haas, Stephen Frost, David Steele, Aleksander Alekseev,
and many others.

Discussion: https://www.postgresql.org/message-id/CAB7nPqRbR3GmFYdedCAhzukfKrgBLTLtMvENOmPrVWREsZkF8g%40mail.gmail.com
Discussion: https://www.postgresql.org/message-id/CAB7nPqSMXU35g%3DW9X74HVeQp0uvgJxvYOuA4A-A3M%2B0wfEBv-w%40mail.gmail.com
Discussion: https://www.postgresql.org/message-id/55192AFE.6080106@iki.fi

Refactor SHA2 functions and move them to src/common/.

This way both frontend and backends can use them. The functions are taken
from pgcrypto, which now fetches the source files it needs from
src/common/.

A new interface is designed for the SHA2 functions, which allow linking
to either OpenSSL or the in-core stuff taken from KAME as needed.

Michael Paquier, reviewed by Robert Haas.

Discussion: https://www.postgresql.org/message-id/CAB7nPqTGKuTM5jiZriHrNaQeVqp5e_iT3X4BFLWY_HyHxLvySQ%40mail.gmail.com

pg_dump: Properly handle public schema ACLs with --clean

pg_dump has always handled the public schema in a special way when it
comes to the "--clean" option.  To wit, we do not drop or recreate the
public schema in "normal" mode, but when we are run in "--clean" mode
then we do drop and recreate the public schema.

When running in "--clean" mode, the public schema is dropped and then
recreated and it is recreated with the normal schema-default privileges
of "nothing".  This is unlike how the public schema starts life, which
is to have CREATE and USAGE GRANT'd to the PUBLIC role, and that is what
is recorded in pg_init_privs.

Due to this, in "--clean" mode, pg_dump would mistakenly only dump out
the set of privileges required to go from the initdb-time privileges on
the public schema to whatever the current-state privileges are.  If the
privileges were not changed from initdb time, then no privileges would
be dumped out for the public schema, but with the schema being dropped
and recreated, the result was that the public schema would have no ACLs
on it instead of what it should have, which is the initdb-time
privileges.

Practically speaking, this meant that pg_dump with --clean mode dumping
a database where the ACLs on the public schema were not changed from the
default would, upon restore, result in a public schema with *no*
privileges GRANT'd, not matching the state of the existing database
(where the initdb-time privileges would have been CREATE and USAGE to
the PUBLIC role for the public schema).

To fix, adjust the query in getNamespaces() to ignore the pg_init_privs
entry for the public schema when running in "--clean" mode, meaning that
the privileges for the public schema would be dumped, correctly, as if
it was going from a newly-created schema to the current state (which is,
indeed, what will happen during the restore thanks to the DROP/CREATE).

Only the public schema is handled in this special way by pg_dump, no
other initdb-time objects are dropped/recreated in --clean mode.

Back-patch to 9.6 where the bug was introduced.

Discussion: https://postgr.es/m/3534542.o3cNaKiDID%40techfox

Repair incorrect pg_dump labeling for some comments and security labels.

We attached no schema label to comments for procedural languages, casts,
transforms, operator classes, operator families, or text search objects.
The first three categories of objects don't really have schemas, but
pg_dump treats them as if they do, and it seems like the TocEntry fields
for their comments had better match the TocEntry fields for the parent
objects.  (As an example of a possible hazard, the type names in a CAST
will be formatted with the assumption of a particular search_path, so
failing to ensure that this same path is active for the COMMENT ON command
could lead to an error or to attaching the comment to the wrong cast.)
In the last six cases, this was a flat-out error --- possibly mine to
begin with, but it was a long time ago.

The security label for a procedural language was likewise not correctly
labeled as to schema, and both the comment and security label for a
procedural language were not correctly labeled as to owner.

In simple cases the restore would accidentally work correctly anyway, since
these comments and security labels would normally get emitted right after
the owning object, and so the search path and active user would be correct
anyhow.  But it could fail in corner cases; for example a schema-selective
restore would omit comments it should include.

Giuseppe Broccolo noted the oversight, and proposed the correct fix, for
text search dictionary objects; I found the rest by cross-checking other
dumpComment() calls.  These oversights are ancient, so back-patch all
the way.

Discussion: https://postgr.es/m/CAFzmHiWwwzLjzwM4x5ki5s_PDMR6NrkipZkjNnO3B0xEpBgJaA@mail.gmail.com

Make simplehash.h grow hashtable in additional cases.

Increase the size when either the distance between actual and optimal
slot grows too large, or when too many subsequent entries would have
to be moved.

This addresses reports that the simplehash performed, sometimes
considerably, worse than dynahash.

The reason turned out to be that insertions into the hashtable where,
due to the use of parallel query, in effect done from another
hashtable, in hash-value order.  If the target hashtable, due to
mis-estimation, was sized a lot smaller than the source table(s) that
lead to very imbalanced tables; a lot of entries in many close-by
buckets from the source tables were inserted into a single, wider,
bucket on the target table.  As the growth factor was solely computed
based on the fillfactor, the performance of the table decreased
further and further.

b81b5a96f424531b was an attempt to address this problem for hash
aggregates (but not for bitmap scans), but it turns out that the
current method of mixing hash values often actually leaves neighboring
hash-values close to each other, just in different value range.  It
might be worth revisiting that independently of the performance issues
addressed in this patch..

To address that problem resize tables in two additional cases: Firstly
when the optimal position for an entry would be far from the actual
position, secondly when many entries would have to be moved to make
space for the new entry (while satisfying the robin hood property).

Due to the additional resizing threshold it seems possible, and
testing confirms that so far, that a higher fillfactor doesn't hurt
performance and saves a bit of memory.  It seems better to increase it
now, before a release containing any of this code, rather than wonder
in some later release.

The various boundaries aren't determined in a particularly scientific
manner, they might need some fine-tuning.

In all my tests the new code now, even with parallelism, performs at
least as good as the old code, in several scenarios significantly
better.

Reported-By: Dilip Kumar, Robert Haas, Kuntal Ghosh
Discussion:
    https://postgr.es/m/CAFiTN-vagvuAydKG9VnWcoK=ADAhxmOa4ZTrmNsViBBooTnriQ@mail.gmail.com
    https://postgr.es/m/CAGz5QC+=fNTYgzMLTBUNeKt6uaWZFXJbkB5+7oWm-n9DwVxcLA@mail.gmail.com

pg_upgrade: Fix large object COMMENTS, SECURITY LABELS

When performing a pg_upgrade, we copy the files behind pg_largeobject
and pg_largeobject_metadata, allowing us to avoid having to dump out and
reload the actual data for large objects and their ACLs.

Unfortunately, that isn't all of the information which can be associated
with large objects.  Currently, we also support COMMENTs and SECURITY
LABELs with large objects and these were being silently dropped during a
pg_upgrade as pg_dump would skip everything having to do with a large
object and pg_upgrade only copied the tables mentioned to the new
cluster.

As the file copies happen after the catalog dump and reload, we can't
simply include the COMMENTs and SECURITY LABELs in pg_dump's binary-mode
output but we also have to include the actual large object definition as
well.  With the definition, comments, and security labels in the pg_dump
output and the file copies performed by pg_upgrade, all of the data and
metadata associated with large objects is able to be successfully pulled
forward across a pg_upgrade.

In 9.6 and master, we can simply adjust the dump bitmask to indicate
which components we don't want.  In 9.5 and earlier, we have to put
explciit checks in in dumpBlob() and dumpBlobs() to not include the ACL
or the data when in binary-upgrade mode.

Adjustments made to the privileges regression test to allow another test
(large_object.sql) to be added which explicitly leaves a large object
with a comment in place to provide coverage of that case with
pg_upgrade.

Back-patch to all supported branches.

Discussion: https://postgr.es/m/20170221162655.GE9812@tamriel.snowman.net

Avoid dangling pointer to relation name in RLS code path in DoCopy().

With RLS active, "COPY tab TO ..." failed under -DRELCACHE_FORCE_RELEASE,
and would sometimes fail without that, because it used the relation name
directly from the relcache as part of the parsetree it's building.  That
becomes a potentially-dangling pointer as soon as the relcache entry is
closed, a bit further down.  Typical symptom if the relcache entry chanced
to get cleared would be "relation does not exist" error with a garbage
relation name, or possibly a core dump; but if you were really truly
unlucky, the COPY might copy from the wrong table.

Per report from Andrew Dunstan that regression tests fail with
-DRELCACHE_FORCE_RELEASE.  The core tests now pass for me (but have
not tried "make check-world" yet).

Discussion: https://postgr.es/m/7b52f900-0579-cda9-ae2e-de5da17090e6@2ndQuadrant.com

Combine several DROP variants into generic DropStmt

Combine DROP of FOREIGN DATA WRAPPER, SERVER, POLICY, RULE, and TRIGGER
into generic DropStmt grammar.

Reviewed-by: Jim Nasby <Jim.Nasby@BlueTreble.com>
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>

Allow dropping multiple functions at once

The generic drop support already supported dropping multiple objects of
the same kind at once.  But the previous representation
of function signatures across two grammar symbols and structure members
made this cumbersome to do for functions, so it was not supported.  Now
that function signatures are represented by a single structure, it's
trivial to add this support.  Same for aggregates and operators.

Reviewed-by: Jim Nasby <Jim.Nasby@BlueTreble.com>
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>

Replace LookupFuncNameTypeNames() with LookupFuncWithArgs()

The old function took function name and function argument list as
separate arguments. Now that all function signatures are passed around
as ObjectWithArgs structs, this is no longer necessary and can be
replaced by a function that takes ObjectWithArgs directly. Similarly
for aggregates and operators.

Reviewed-by: Jim Nasby <Jim.Nasby@BlueTreble.com>
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>

Remove objname/objargs split for referring to objects

In simpler times, it might have worked to refer to all kinds of objects
by a list of name components and an optional argument list.  But this
doesn't work for all objects, which has resulted in a collection of
hacks to place various other nodes types into these fields, which have
to be unpacked at the other end.  This makes it also weird to represent
lists of such things in the grammar, because they would have to be lists
of singleton lists, to make the unpacking work consistently.  The other
problem is that keeping separate name and args fields makes it awkward
to deal with lists of functions.

Change that by dropping the objargs field and have objname, renamed to
object, be a generic Node, which can then be flexibly assigned and
managed using the normal Node mechanisms.  In many cases it will still
be a List of names, in some cases it will be a string Value, for types
it will be the existing Typename, for functions it will now use the
existing ObjectWithArgs node type.  Some of the more obscure object
types still use somewhat arbitrary nested lists.

Reviewed-by: Jim Nasby <Jim.Nasby@BlueTreble.com>
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>

Add operator_with_argtypes grammar rule

This makes the handling of operators similar to that of functions and
aggregates.

Rename node FuncWithArgs to ObjectWithArgs, to reflect the expanded use.

Reviewed-by: Jim Nasby <Jim.Nasby@BlueTreble.com>
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>

Use class_args field in opclass_drop

This makes it consistent with the usage in opclass_item.

Reviewed-by: Jim Nasby <Jim.Nasby@BlueTreble.com>
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>

Fix incorrect comments.

Commit 19dc233c32f2900e57b8da4f41c0f662ab42e080 introduced these
comments. Michael Paquier noticed that one of them had a typo, but
a bigger problem is that they were not an accurate description of
what the code was doing.

Patch by me.

Mark pg_start_backup and pg_stop_backup as parallel-restricted.

They depend on backend-private state that will not be synchronized by
the parallel machinery, so they should not be marked parallel-safe.
This issue also exists in 9.6, but we obviously can't do anything
about 9.6 clusters that already exist. Possibly this could be
back-patched so that future 9.6 clusters would come out OK, or
possibly we should back-patch some other fix, but that would need more
discussion.

David Steele, reviewed by Michael Paquier

Discussion: http://postgr.es/m/CA+TgmoYCWfO2UM-t=HUMFJyxJywLDiLL0nAJpx88LKtvBvNECw@mail.gmail.com

Fix user-after-free bug.

Introduced by commit aea5d298362e881b13d95a48c5ae116879237389.

Patch from Amit Kapila. Issue discovered independently by Amit Kapila
and Ashutosh Sharma.

Reorder the asynchronous libpq calls for replication connection

Per libpq documentation, the initial state must be
PGRES_POLLING_WRITING. Failing to do that appears to cause some issues
on some Windows systems.

From: Petr Jelinek <petr.jelinek@2ndquadrant.com>

Enhance docs for ALTER TABLE lock levels of storage parms

As requested by Robert Haas

Reduce lock levels for table storage params related to planning

The following parameters are now updateable with ShareUpdateExclusiveLock
effective_io_concurrency
parallel_workers
seq_page_cost
random_page_cost
n_distinct
n_distinct_inherited

Simon Riggs and Fabrízio Mello

Allow partitioned tables to be dropped without CASCADE

Record partitioned table dependencies as DEPENDENCY_AUTO
rather than DEPENDENCY_NORMAL, so that DROP TABLE just works.

Remove all the tests for partitioned tables where earlier
work had deliberately avoided using CASCADE.

Amit Langote, reviewed by Ashutosh Bapat and myself

In rebuild_relation(), don't access an already-closed relcache entry.

This reliably fails with -DRELCACHE_FORCE_RELEASE, as reported by
Andrew Dunstan, and could sometimes fail in normal operation, resulting
in a wrong persistence value being used for the transient table.
It's not immediately clear to me what effects that might have beyond
the risk of a crash while accessing OldHeap->rd_rel->relpersistence,
but it's probably not good.

Bug introduced by commit f41872d0c, and made substantially worse by
commit 85b506bbf, which added a second such access significantly
later than the heap_close. I doubt the first reference could fail
in a production scenario, but the second one definitely could.

Discussion: https://postgr.es/m/7b52f900-0579-cda9-ae2e-de5da17090e6@2ndQuadrant.com

pg_dump: Fix ordering

Materialized views refresh should be last.

From: Jim Nasby <Jim.Nasby@BlueTreble.com>

Disallow CREATE/DROP SUBSCRIPTION in transaction block

Disallow CREATE SUBSCRIPTION and DROP SUBSCRIPTION in a transaction
block when the replication slot is to be created or dropped, since that
cannot be rolled back.

based on patch by Masahiko Sawada <sawada.mshk@gmail.com>

Fix parsing of DROP SUBSCRIPTION ... DROP SLOT

It didn't actually parse before.

Reported-by: Masahiko Sawada <sawada.mshk@gmail.com>

Fix two recently introduced grammar errors in mmgr/README.

These were introduced by me in f4e2d50c.

Reported-By: Tomas Vondra
Discussion: https://postgr.es/m/11adca69-be28-44bc-a801-64e6d53851e3@2ndquadrant.com

Fix typo

doc: Put callouts in SQL comments

This makes copy-and-pasting the SQL code easier.

From: Thomas Munro <thomas.munro@enterprisedb.com>

psql: Add tab completion for logical replication

Add tab completion for publications and subscriptions. Also, to be able
to get a list of subscriptions, make pg_subscription world-readable but
revoke access to subconninfo using column privileges.

From: Michael Paquier <michael.paquier@gmail.com>

Add RENAME support for PUBLICATIONs and SUBSCRIPTIONs

From: Petr Jelinek <petr.jelinek@2ndquadrant.com>

Fix after trigger execution in logical replication

From: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Tested-by: Thom Brown <thom@linux.com>

Use asynchronous connect API in libpqwalreceiver

This makes the connection attempt from CREATE SUBSCRIPTION and from
WalReceiver interruptable by the user in case the libpq connection is
hanging. The previous coding required immediate shutdown (SIGQUIT) of
PostgreSQL in that situation.

From: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Tested-by: Thom Brown <thom@linux.com>

Allow vacuums to report oldestxmin

Allow VACUUM and Autovacuum to report the oldestxmin value they
used while cleaning tables, helping to make better sense out of
the other statistics we report in various cases.

Add pg_current_logfile() function.

The syslogger will write out the current stderr and csvlog names, if
it's running and there are any, to a new file in the data directory
called "current_logfiles".  We take care to remove this file when it
might no longer be valid (but not at shutdown).  The function
pg_current_logfile() can be used to read the entries in the file.

Gilles Darold, reviewed and modified by Karl O.  Pinc, Michael
Paquier, and me.  Further review by Álvaro Herrera and Christoph Berg.

Notify bgworker registrant after freeing worker slot.

Tom Lane observed buildfarm failures caused by the select_parallel
regression test trying to launch new parallel queries before the
worker slots used by the previous ones were freed.  Try to fix this by
having the postmaster free the worker slots before it sends the
SIGUSR1 notifications to the registering process.  This doesn't
completely eliminate the possibility that the user backend might
(correctly) observe the worker as dead before the slot is free, but I
believe it should make the window significantly narrower.

Patch by me, per complaint from Tom Lane.  Reviewed by Amit Kapila.

Discussion: http://postgr.es/m/30673.1487310734@sss.pgh.pa.us

Improve error reporting for tuple-routing failures.

Currently, the whole row is shown without column names. Instead,
adopt a style similar to _bt_check_unique() in ExecFindPartition()
and show the failing key: (key1, ...) = (val1, ...).

Amit Langote, per a complaint from Simon Riggs. Reviewed by me;
I also adjusted the grammar in one of the comments.

Discussion: http://postgr.es/m/9f9dc7ae-14f0-4a25-5485-964d9bfc19bd@lab.ntt.co.jp

Correct old release note item

Update documentation of tsquery_phrase().

Missed in commit 028350f61. Noted by Eiji Seki.

Refactor bitmap heap scan in preparation for parallel support.

The final patch will be less messy if the prefetching support is
a bit better isolated, so do that.

Dilip Kumar, with some changes by me. The larger patch set of which
this is a part has been reviewed and tested by (at least) Andres
Freund, Amit Khandekar, Tushar Ahuja, Rafia Sabih, Haribabu Kommi, and
Thomas Munro.

Don't uselessly rewrite, truncate, VACUUM, or ANALYZE partitioned tables.

Also, recursively perform VACUUM and ANALYZE on partitions when the
command is applied to a partitioned table. In passing, some related
documentation updates.

Amit Langote, reviewed by Michael Paquier, Ashutosh Bapat, and by me.

Discussion: http://postgr.es/m/47288cf1-f72c-dfc2-5ff0-4af962ae5c1b@lab.ntt.co.jp

Update comments overlooked by 2f5c9d9c9cec436e55847ec580606d7e88067df6.

Tomas Vondra

Handle unaligned SerializeSnapshot() buffer.

Likewise in RestoreSnapshot().  Do so by copying between the user buffer
and a stack buffer of known alignment.  Back-patch to 9.6, where this
last applies cleanly.  In master, the select_parallel test dies with
SIGBUS on "Oracle Solaris 10 1/13 s10s_u11wos_24a SPARC", building
32-bit with gcc 4.9.2.  In 9.6 and 9.5, the buffers in question happen
to be sufficiently-aligned, and this change is mere insurance against
future 9.6 changes or extension code compromising that.

Create <sect3> in the functions-xml section

This is a small change so that a new XMLTABLE sect3 can be added easily
later.

Author: Craig Ringer
Discussion: https://postgr.es/m/CAFj8pRAgfzMD-LoSmnMGybD0WsEznLHWap8DO79+-GTRAPR4qA@mail.gmail.com

Fix timeouts in PostgresNode::psql

Newer Perl or IPC::Run versions default to appending the filename to string
exceptions, e.g. the exception

psql timed out

is thrown as

psql timed out at /usr/share/perl5/vendor_perl/IPC/Run.pm line 2961.

To handle this, match exceptions with !~ rather than ne.

From: Craig Ringer <craig@2ndquadrant.com>
Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>