Tom Lane [Fri, 13 May 2016 00:04:12 +0000 (20:04 -0400)]
Ensure plan stability in contrib/btree_gist regression test.
Buildfarm member skink failed with symptoms suggesting that an
auto-analyze had happened and changed the plan displayed for a
test query. Although this is evidently of low probability,
regression tests that sometimes fail are no fun, so add commands
to force a bitmap scan to be chosen.
Alvaro Herrera [Thu, 12 May 2016 19:02:49 +0000 (16:02 -0300)]
Fix bogus comments
Some comments mentioned XLogReplayBuffer, but there's no such function:
that was an interim name for a function that got renamed to
XLogReadBufferForRedo, before commit 2c03216d831160 was pushed.
Tom Lane [Wed, 11 May 2016 21:06:53 +0000 (17:06 -0400)]
Fix infer_arbiter_indexes() to not barf on system columns.
While it could be argued that rejecting system column mentions in the
ON CONFLICT list is an unsupported feature, falling over altogether
just because the table has a unique index on OID is indubitably a bug.
As far as I can tell, fixing infer_arbiter_indexes() is sufficient to
make ON CONFLICT (oid) actually work, though making a regression test
for that case is problematic because of the impossibility of setting
the OID counter to a known value.
Tom Lane [Wed, 11 May 2016 20:20:03 +0000 (16:20 -0400)]
Fix assorted missing infrastructure for ON CONFLICT.
subquery_planner() failed to apply expression preprocessing to the
arbiterElems and arbiterWhere fields of an OnConflictExpr. No doubt the
theory was that this wasn't necessary because we don't actually try to
execute those expressions; but that's wrong, because it results in failure
to match to index expressions or index predicates that are changed at all
by preprocessing. Per bug #14132 from Reynold Smith.
Also add pullup_replace_vars processing for onConflictWhere. Perhaps
it's impossible to have a subquery reference there, but I'm not exactly
convinced; and even if true today it's a failure waiting to happen.
Also add some comments to other places where one or another field of
OnConflictExpr is intentionally ignored, with explanation as to why it's
okay to do so.
Also, catalog/dependency.c failed to record any dependency on the named
constraint in ON CONFLICT ON CONSTRAINT, allowing such a constraint to
be dropped while rules exist that depend on it, and allowing pg_dump to
dump such a rule before the constraint it refers to. The normal execution
path managed to error out reasonably for a dangling constraint reference,
but ruleutils.c dumped core; so in addition to fixing the omission, add
a protective check in ruleutils.c, since we can't retroactively add a
dependency in existing databases.
Alvaro Herrera [Tue, 10 May 2016 19:23:54 +0000 (16:23 -0300)]
Fix autovacuum for shared relations
The table-skipping logic in autovacuum would fail to consider that
multiple workers could be processing the same shared catalog in
different databases. This normally wouldn't be a problem: firstly
because autovacuum workers not for wraparound would simply ignore tables
in which they cannot acquire lock, and secondly because most of the time
these tables are small enough that even if multiple for-wraparound
workers are stuck in the same catalog, they would be over pretty
quickly. But in cases where the catalogs are severely bloated it could
become a problem.
Backpatch all the way back, because the problem has been there since the
beginning.
Tom Lane [Sun, 8 May 2016 20:36:19 +0000 (16:36 -0400)]
Docs: create some user-facing documentation about index-only scans.
We didn't have any real user documentation about how index-only scans
work or how to design indexes to exploit them. Remedy that.
Per gripe from David Johnston.
Tom Lane [Sat, 7 May 2016 20:36:50 +0000 (16:36 -0400)]
In new pg_dump TAP tests, remove trailing "$" from regexps using /m.
It emerges that some Perl versions before 5.8.9 have a bug with regexps
that use the /m flag and contain "$". This is the reason why jacana
is still failing on HEAD, and I was able to duplicate the failure on
prairiedog's host. There's no real need for "$" in these patterns,
since they are already matching through the statement-terminating
semicolons (or matching an explicit \n in some cases). So just
remove it.
Note: the reason jacana hasn't actually reported any failures in the
last little while is that the way the pg_dump TAP tests are set up, any
failure of this sort results in echoing the entire pg_dump dump output
to stderr. Since there were about a hundred such failures, that resulted
in a 30MB log file which choked the buildfarm upload script. There is
room for improvement here :-(.
Tom Lane [Sat, 7 May 2016 17:16:50 +0000 (13:16 -0400)]
Docs: improve warnings about nextval() not producing gapless sequences.
In the documentation for nextval(), point out explicitly that INSERT ...
ON CONFLICT will call nextval() if needed for the insertion case, whether
or not it ends up following the ON CONFLICT path. This seems to be a
matter of some confusion, cf bug #14126, so let's be clear about it.
Also mention the issue in the CREATE SEQUENCE reference page, since that
is another place where people might expect such things to be covered.
Minor wording improvements nearby, as well.
Back-patch to 9.5 where ON CONFLICT was introduced.
Tom Lane [Sat, 7 May 2016 02:05:51 +0000 (22:05 -0400)]
Fix pg_upgrade to not fail when new-cluster TOAST rules differ from old.
This patch essentially reverts commit 4c6780fd17aa43ed, in favor of a much
simpler solution for the case where the new cluster would choose to create
a TOAST table but the old cluster doesn't have one: just don't create a
TOAST table.
The existing code failed in at least two different ways if the situation
arose: (1) ALTER TABLE RESET didn't grab an exclusive lock, so that the
lock sanity check in create_toast_table failed; (2) pg_upgrade did not
provide a pg_type OID for the new toast table, so that the crosscheck in
TypeCreate failed. While both these problems were introduced by later
patches, they show that the hack being used to cause TOAST table creation
is overwhelmingly fragile (and untested). I also note that before the
TypeCreate crosscheck was added, the code would have resulted in assigning
an indeterminate pg_type OID to the toast table, possibly causing a later
OID conflict in that catalog; so that it didn't really work even when
committed.
If we simply don't create a TOAST table, there will only be a problem if
the code tries to store a tuple that's wider than a page, and field
compression isn't sufficient to get it under a page. Given that the TOAST
creation threshold is intended to be about a quarter of a page, it's very
hard to believe that cross-version differences in the do-we-need-a-toast-
table heuristic could result in an observable problem. So let's just
follow the old version's conclusion about whether a TOAST table is needed.
(If we ever do change needs_toast_table() so much that this conclusion
doesn't apply, we can devise a solution at that time, and hopefully do
it in a less klugy way than 4c6780fd17aa43ed did.)
Stephen Frost [Sat, 7 May 2016 01:24:31 +0000 (21:24 -0400)]
Disable BLOB test in pg_dump TAP tests
Buildfarm member jacana appears to have an issue with running this
test. It's not entirely clear to me why, but rather than try to
fight with it, just disable it for now.
None of the other tests try to write out from psql directly as
this test does, so it seems likely that the rest of the tests will
be fine (as they have been on numerous other systems).
Kevin Grittner [Sat, 7 May 2016 01:05:29 +0000 (20:05 -0500)]
Mitigate "snapshot too old" performance regression on NUMA
Limit maintenance of time to xid mapping to once per minute. At
least in the tested case this brings performance within 5% of when
the feature is off, compared to several times slower without this
patch.
While there, fix comments and whitespace.
Ants Aasma, with cosmetic adjustments suggested by Andres Freund
Reviewed by Kevin Grittner and Andres Freund
Tom Lane [Fri, 6 May 2016 21:42:44 +0000 (17:42 -0400)]
Docs: minor copy-editing for GSSAPI/SSPI authentication docs.
Describe compat_realm = 0 as "disabled" not "enabled", per discussion
with Christian Ullrich. I failed to resist the temptation to do some
other minor copy-editing in the same area.
Stephen Frost [Fri, 6 May 2016 20:39:56 +0000 (16:39 -0400)]
Add test_pg_dump to @contrib_excludes
The test_pg_dump extension doesn't have a C component, so we need
to exclude it from the MSVC build system trying to figure out how
to build it.
Also add a "MODULES" line to the Makefile, as test_extensions has.
Might not be necessary, but seems good to keep things consistent.
Lastly, remove the 'installcheck' line from test_pg_dump, as that
was causing redefinition errors, at least on my box. This also
makes test_pg_dump consistent with how commit_ts is set up.
Stephen Frost [Fri, 6 May 2016 19:26:57 +0000 (15:26 -0400)]
Remove MODULES_big from test_pg_dump
The Makefile for test_pg_dump shouldn't have a MODULES_big line
because there's no actual compiled bit for that extension. Hopefully
this will fix the Windows buildfarm members which were complaining.
In passing, also add the 'prove_installcheck' bit to the pg_dump and
test_pg_dump Makefiles, to get the buildfarm members to actually run
those tests.
Robert Haas [Fri, 6 May 2016 19:00:55 +0000 (15:00 -0400)]
Minimal fix for crash bug in quals_match_foreign_key.
Discussion is still underway as to whether to revert the entire patch
that added this function, but that discussion may not conclude before
beta1. So, in the meantime, let's do at least this much.
Robert Haas [Fri, 6 May 2016 18:43:34 +0000 (14:43 -0400)]
Limit maximum parallel degree to 1024.
This new limit affects both the max_parallel_degree GUC and the
parallel_degree reloption. There may some day be a use case for using
more than 1024 CPUs for a single query, but that's surely not the case
right now. Not only do not very many people have that many CPUs, but
the code hasn't been tested at that kind of scale and is very unlikely
to perform well, or even work at all, without a lot more work. The
issue addressed by commit 06bd458cb812623c3f1fdd55216c4c08b06a8447 is
probably just one problem of many.
The idea of a more reasonable limit here was suggested by Tom Lane;
the value of 1024 was suggested by Amit Kapila.
Tom Lane [Fri, 6 May 2016 18:23:45 +0000 (14:23 -0400)]
Improve pg_upgrade's report about failure to match up old and new tables.
Ordinarily, pg_upgrade shouldn't have any difficulty in matching up all
the relations it sees in the old and new databases. If it does, however,
it just goes belly-up with a pretty unhelpful error message. That seemed
fine as long as we expected the case never to occur in the wild, but
Alvaro reported that it had been seen in a database whose pg_largeobject
table had somehow acquired a TOAST table. That doesn't quite seem like
a case that pg_upgrade actually needs to handle, but it would be good if
the report were more diagnosable. Hence, extend the logic to print out
as much information as we can about the mismatch(es) before we quit.
In passing, improve the readability of get_rel_infos()'s data collection
query, which had suffered seriously from lets-not-bother-to-update-comments
syndrome, and generally was unnecessarily disrespectful to readers.
It could be argued that this is a bug fix, but given that we have so few
reports, I don't feel a need to back-patch; at least not before this has
baked awhile in HEAD.
Robert Haas [Fri, 6 May 2016 18:23:47 +0000 (14:23 -0400)]
Use mul_size when multiplying by the number of parallel workers.
That way, if the result overflows size_t, you'll get an error instead
of undefined behavior, which seems like a plus. This also has the
effect of casting the number of workers from int to Size, which is
better because it's harder to overflow int than size_t.
Dilip Kumar reported this issue and provided a patch upon which this
patch is based, but his version did use mul_size.
Stephen Frost [Fri, 6 May 2016 18:06:50 +0000 (14:06 -0400)]
Remove various special checks around default roles
Default roles really should be like regular roles, for the most part.
This removes a number of checks that were trying to make default roles
extra special by not allowing them to be used as regular roles.
We still prevent users from creating roles in the "pg_" namespace or
from altering roles which exist in that namespace via ALTER ROLE, as
we can't preserve such changes, but otherwise the roles are very much
like regular roles.
Stephen Frost [Fri, 6 May 2016 18:06:50 +0000 (14:06 -0400)]
Add TAP tests for pg_dump
This TAP test suite will create a new cluster, populate it based on
the 'create_sql' values in the '%tests' hash, run all of the runs
defined in the '%pgdump_runs' hash, and then for each test in the
'%tests' hash, compare each run's output the the regular expression
defined for the test under the 'like' and 'unlike' functions, as
appropriate.
While this test suite covers a fair bit of ground (67% of pg_dump.c
and quite a bit of the other files in src/bin/pg_dump), there is
still quite a bit which remains to be added to provide better code
coverage. Still, this is quite a bit better than we had, and has
found a few bugs already (note that the CREATE TRANSFORM test is
commented out, as it is currently failing).
Idea for using the TAP system from Tom, though all of the code is mine.
Stephen Frost [Fri, 6 May 2016 18:06:50 +0000 (14:06 -0400)]
Only issue LOCK TABLE commands when necessary
Reviewing the cases where we need to LOCK a given table during a dump,
it was pointed out by Tom that we really don't need to LOCK a table if
we are only looking to dump the ACL for it, or certain other
components. After reviewing the queries run for all of the component
pieces, a list of components were determined to not require LOCK'ing
of the table.
This implements a check to avoid LOCK'ing those tables.
Initial complaint from Rushabh Lathia, discussed with Robert and Tom,
the patch is mine.
Stephen Frost [Fri, 6 May 2016 18:06:50 +0000 (14:06 -0400)]
pg_dump performance and other fixes
Do not try to dump objects which do not have ACLs when only ACLs are
being requested. This results in a significant performance improvement
as we can avoid querying for further information on these objects when
we don't need to.
When limiting the components to dump for an extension, consider what
components have been requested. Initially, we incorrectly hard-coded
the components of the extension objects to dump, which would mean that
we wouldn't dump some components even with they were asked for and in
other cases we would dump components which weren't requested.
Correct defaultACLs to use 'dump_contains' instead of 'dump'. The
defaultACL is considered a member of the namespace and should be
dumped based on the same set of components that the other objects in
the schema are, not based on what we're dumping for the namespace
itself (which might not include ACLs, if the namespace has just the
default or initial ACL).
Use DUMP_COMPONENT_ACL for from-initdb objects, to allow users to
change their ACLs, should they wish to. This just extends what we
are doing for the pg_catalog namespace to objects which are not
members of namespaces.
Due to column ACLs being treated a bit differently from other ACLs
(they are actually reset to NULL when all privileges are revoked),
adjust the query which gathers column-level ACLs to consider all of
the ACL-relevant columns.
Stephen Frost [Fri, 6 May 2016 18:06:50 +0000 (14:06 -0400)]
Correct pg_dump WHERE clause for functions/aggregates
The query to grab the function/aggregate information is now joining
to pg_init_privs, so we can simplify (and correct) the WHERE clause
used to determine if a given function's ACL has changed from the
initial ACL on the function.
Tom Lane [Fri, 6 May 2016 16:09:20 +0000 (12:09 -0400)]
Fix possible read past end of string in to_timestamp().
to_timestamp() handles the TH/th format codes by advancing over two input
characters, whatever those are. It failed to notice whether there were
two characters available to be skipped, making it possible to advance
the pointer past the end of the input string and keep on parsing.
A similar risk existed in the handling of "Y,YYY" format: it would advance
over three characters after the "," whether or not three characters were
available.
In principle this might be exploitable to disclose contents of server
memory. But the security team concluded that it would be very hard to use
that way, because the parsing loop would stop upon hitting any zero byte,
and TH/th format codes can't be consecutive --- they have to follow some
other format code, which would have to match whatever data is there.
So it seems impractical to examine memory very much beyond the end of the
input string via this bug; and the input string will always be in local
memory not in disk buffers, making it unlikely that anything very
interesting is close to it in a predictable way. So this doesn't quite
rise to the level of needing a CVE.
Tom Lane [Fri, 6 May 2016 15:01:05 +0000 (11:01 -0400)]
Improve handling of numeric-valued variables in pgbench.
The previous coding always stored variable values as strings, doing
conversion on-the-fly when a numeric value was needed or a number was to be
assigned. This was a bit inefficient and risked loss of precision for
floating-point values. The precision aspect had been hacked around by
printing doubles in "%.18e" format, which is ugly and has machine-dependent
results. Instead, arrange to preserve an assigned numeric value in the
original binary numeric format, converting to string only when and if
needed. When we do need to convert a double to string, convert in "%g"
format with DBL_DIG precision, which is the standard way to do it and
produces the least surprising results in most cases.
The implementation supports storing both a string value and a numeric
value for any one variable, with lazy conversion between them. I also
arranged for lazy re-sorting of the variable array when new variables are
added. That was mainly to allow a clean refactoring of putVariable()
into two levels of subroutine, but it may allow us to save a few sorts.
Kevin Grittner [Fri, 6 May 2016 12:47:12 +0000 (07:47 -0500)]
Fix hash index vs "snapshot too old" problemms
Hash indexes are not WAL-logged, and so do not maintain the LSN of
index pages. Since the "snapshot too old" feature counts on
detecting error conditions using the LSN of a table and all indexes
on it, this makes it impossible to safely do early vacuuming on any
table with a hash index, so add this to the tests for whether the
xid used to vacuum a table can be adjusted based on
old_snapshot_threshold.
While at it, add a paragraph to the docs for old_snapshot_threshold
which specifically mentions this and other aspects of the feature
which may otherwise surprise users.
Problem reported and patch reviewed by Amit Kapila
Dean Rasheed [Fri, 6 May 2016 11:48:27 +0000 (12:48 +0100)]
Fix psql's \ev and \sv commands so that they handle view reloptions.
Commit 8eb6407aaeb6cbd972839e356b436bb698f51cff added support for
editing and showing view definitions, but neglected to account for
view options such as security_barrier and WITH CHECK OPTION which are
not returned by pg_get_viewdef() and so need special handling.
Author: Dean Rasheed Reviewed-by: Peter Eisentraut
Discussion: http://www.postgresql.org/message-id/CAEZATCWZjCgKRyM-agE0p8ax15j9uyQoF=qew7D2xB6cF76T8A@mail.gmail.com
Dean Rasheed [Fri, 6 May 2016 11:45:36 +0000 (12:45 +0100)]
Move and rename fmtReloptionsArray().
Move fmtReloptionsArray() from pg_dump.c to string_utils.c so that it
is available to other frontend code. In particular psql's \ev and \sv
commands need it to handle view reloptions. Also rename the function
to appendReloptionsArray(), which is a more accurate description of
what it does.
Author: Dean Rasheed Reviewed-by: Peter Eisentraut
Discussion: http://www.postgresql.org/message-id/CAEZATCWZjCgKRyM-agE0p8ax15j9uyQoF=qew7D2xB6cF76T8A@mail.gmail.com
Tom Lane [Fri, 6 May 2016 02:37:30 +0000 (22:37 -0400)]
Further 9.6 release note improvements.
Call out the major enhancements in this release as identified by
pgsql-advocacy discussion, and rearrange some of the entries to
make those items more prominent. Other minor improvements per
advice from Vitaly Burovoy, Masahiko Sawada, Peter Geoghegan,
and Andres Freund.
Tom Lane [Fri, 6 May 2016 00:08:58 +0000 (20:08 -0400)]
Update time zone data files to tzdata release 2016d.
DST law changes in Russia (Magadan, Tomsk regions) and Venezuela.
Historical corrections for Russia. There are new zone names Europe/Kirov
and Asia/Tomsk reflecting the fact that these regions now have different
time zone histories from adjacent regions.
Tom Lane [Thu, 5 May 2016 18:51:00 +0000 (14:51 -0400)]
Rename pgbench min/max to least/greatest, and fix handling of double args.
These functions behave like the backend's least/greatest functions,
not like min/max, so the originally-chosen names invite confusion.
Per discussion, rename to least/greatest.
I also took it upon myself to make them return double if any input is
double. The previous behavior of silently coercing all inputs to int
surely does not meet the principle of least astonishment.
Copy-edit some of the other new functions' documentation, too.
Tom Lane [Thu, 5 May 2016 16:33:12 +0000 (12:33 -0400)]
Fix ordering/categorization of some recently-added system views.
Somebody added pg_replication_origin, pg_replication_origin_status and
pg_replication_slots to catalogs.sgml without a whole lot of concern for
either alphabetical order or the difference between a table and a view.
Clean up the mess.
Back-patch to 9.5, not so much because this is critical as because if
I don't it will result in a cross-branch divergence in release-9.5.sgml,
which would be a maintenance hazard.
Dean Rasheed [Thu, 5 May 2016 10:16:17 +0000 (11:16 +0100)]
Fix corner-case loss of precision in numeric pow() calculation
Commit 7d9a4737c268f61fb8800957631f12d3f13be218 greatly improved the
accuracy of the numeric transcendental functions, however it failed to
consider the case where the result from pow() is close to the overflow
threshold, for example 0.12 ^ -2345.6. For such inputs, where the
result has more than 2000 digits before the decimal point, the decimal
result weight estimate was being clamped to 2000, leading to a loss of
precision in the final calculation.
Fix this by replacing the clamping code with an overflow test that
aborts the calculation early if the final result is sure to overflow,
based on the overflow limit in exp_var(). This provides the same
protection against integer overflow in the subsequent result scale
computation as the original clamping code, but it also ensures that
precision is never lost and saves compute cycles in cases that are
sure to overflow.
The new early overflow test works with the initial low-precision
result (expected to be accurate to around 8 significant digits) and
includes a small fuzz factor to ensure that it doesn't kick in for
values that would not overflow exp_var(), so the overall overflow
threshold of pow() is unchanged and consistent for all inputs with
non-integer exponents.
Author: Dean Rasheed Reviewed-by: Tom Lane
Discussion: http://www.postgresql.org/message-id/CAEZATCUj3U-cQj0jjoia=qgs0SjE3auroxh8swvNKvZWUqegrg@mail.gmail.com
See-also: http://www.postgresql.org/message-id/CAEZATCV7w+8iB=07dJ8Q0zihXQT1semcQuTeK+4_rogC_zq5Hw@mail.gmail.com
This feature has shown enough immaturity that it was deemed better to
rip it out before rushing some more fixes at the last minute. There are
discussions on larger changes in this area for the next release.
Andres Freund [Wed, 4 May 2016 08:54:20 +0000 (01:54 -0700)]
Fix transient mdsync() errors of truncated relations due to 72a98a6395.
Unfortunately the segment size checks from 72a98a6395 had the negative
side-effect of breaking a corner case in mdsync(): When processing a
fsync request for a truncated away segment mdsync() could fail with
"could not fsync file" (if previous segment < RELSEG_SIZE) because
_mdfd_getseg() now wouldn't return the relevant segment anymore.
The cleanest fix seems to be to allow the caller of _mdfd_getseg() to
specify whether checks for RELSEG_SIZE are performed. To allow doing so,
change the ExtensionBehavior enum into a bitmask. Besides allowing for
the addition of EXTENSION_DONT_CHECK_SIZE, this makes for a nicer
implementation of EXTENSION_REALLY_RETURN_NULL.
Besides mdsync() the only callsite that should change behaviour due to
this is mdprefetch() which now doesn't create segments anymore, even in
recovery. Given the uses of mdprefetch() that seems better.
Reported-By: Thom Brown
Discussion: CAA-aLv72QazLvPdKZYpVn4a_Eh+i4_cxuB03k+iCuZM_xjc+6Q@mail.gmail.com
Robert Haas [Tue, 3 May 2016 18:36:38 +0000 (14:36 -0400)]
Fix more things to be parallel-safe.
Conversion functions were previously marked as parallel-unsafe, since
that is the default, but in fact they are safe. Parallel-safe
functions defined in pg_proc.h and redefined in system_views.sql were
ending up as parallel-unsafe because the redeclarations were not
marked PARALLEL SAFE. While editing system_views.sql, mark ts_debug()
parallel safe also.
Robert Haas [Tue, 3 May 2016 14:52:25 +0000 (10:52 -0400)]
Tweak a few more things in preparation for upcoming pgindent run.
These adjustments adjust code and comments in minor ways to prevent
pgindent from mangling them. Among other things, I tried to avoid
situations where pgindent would emit "a +b" instead of "a + b", and I
tried to avoid having it break up inline comments across multiple
lines.
Alvaro Herrera [Mon, 2 May 2016 19:04:29 +0000 (16:04 -0300)]
Fix code comments regarding logical decoding
Back in 3b02ea4f0780 I added some comments in various places to explain
how logical decoding and other things worked. Not all of the changes
were welcome, because they were misleading or wrong. This changes them
a little bit to make them more accurate.
Some other comments are also changed to be more accurate. Also, fix a
bunch of typos.
Tom Lane [Mon, 2 May 2016 15:18:10 +0000 (11:18 -0400)]
Fix configure's incorrect version tests for flex and perl.
awk's equality-comparison operator is "==" not "=". We got this right
in many places, but not in configure's checks for supported version
numbers of flex and perl. It hadn't been noticed because unsupported
versions are so old as to be basically extinct in the wild, and because
the only consequence is whether or not a WARNING flies by during
configure.
Daniel Gustafsson noted the problem with respect to the test for flex,
I found the other by reviewing other awk calls.
Robert Haas [Mon, 2 May 2016 14:42:34 +0000 (10:42 -0400)]
Fix parallel safety markings for pg_start_backup.
Commit 7117685461af50f50c03f43e6a622284c8d54694 made pg_start_backup
parallel-restricted rather than parallel-safe, because it now relies
on backend-private state that won't be synchronized with the parallel
worker. However, it didn't update pg_proc.h. Separately, Andreas
Karlsson observed that system_views.sql neglected to reiterate the
parallel-safety markings whe redefining various functions, including
this one; so add a PARALLEL RESTRICTED declaration there to match
the new value in pg_proc.h.
CHECK_PAGE_OFFSET_RANGE() has been unused forever.
CHECK_RELATION_BLOCK_RANGE() has been unused in pgstatindex.c ever since
bt_page_stats() and bt_page_items() functions were moved from pgstattuple
to pageinspect module. It still exists in pageinspect/btreefuncs.c.
Tom Lane [Sun, 1 May 2016 15:24:32 +0000 (11:24 -0400)]
Add a --non-master-only option to git_changelog.
This has the inverse effect of --master-only. It's needed to help find
cases where a commit should not be described in major release notes
because it was back-patched into older branches, though not at the same
time as the HEAD commit.
Tom Lane [Sat, 30 Apr 2016 18:08:00 +0000 (14:08 -0400)]
Small improvements to OPTIMIZER_DEBUG code.
Now that Paths have their own rows field, print that rather than
the parent relation's rowcount.
Show the relid sets associated with Paths using table names rather
than numbers; since this code is able to print simple Var references
using table names, it seems a bit silly that print_relids can't.
Print the cheapest_parameterized_paths list for a RelOptInfo, and
include information about a parameterized path's required_outer rels.
Noted while trying to use this feature to debug Alexander Kirkouski's
recent bug report.
Tom Lane [Sat, 30 Apr 2016 16:29:21 +0000 (12:29 -0400)]
Fix planner crash from pfree'ing a partial path that a GatherPath uses.
We mustn't run generate_gather_paths() during add_paths_to_joinrel(),
because that function can be invoked multiple times for the same target
joinrel. Not only is it wasteful to build GatherPaths repeatedly, but
a later add_partial_path() could delete the partial path that a previously
created GatherPath depends on. Instead establish the convention that we
do generate_gather_paths() for a rel only just before set_cheapest().
The code was accidentally not broken for baserels, because as of today there
never is more than one partial path for a baserel. But that assumption
obviously has a pretty short half-life, so move the generate_gather_paths()
calls for those cases as well.
Also add some generic comments explaining how and why this all works.
Tom Lane [Sat, 30 Apr 2016 14:54:45 +0000 (10:54 -0400)]
Remove warning about num_sync being too large in synchronous_standby_names.
If we're not going to reject such setups entirely, throwing a WARNING in
check_synchronous_standby_names() is unhelpful, because it will cause the
warning to be logged again every time the postmaster receives SIGHUP.
Per discussion, just remove the warning.
In passing, improve the documentation for synchronous_commit, which had not
gotten the word that now there can be more than one synchronous standby.
Tom Lane [Sat, 30 Apr 2016 00:19:38 +0000 (20:19 -0400)]
Fix mishandling of equivalence-class tests in parameterized plans.
Given a three-or-more-way equivalence class, such as X.Y = Y.Y = Z.Z,
it was possible for the planner to omit one of the quals needed to
enforce that all members of the equivalence class are actually equal.
This only happened in the case of a parameterized join node for two
of the relations, that is a plan tree like
Nested Loop
-> Scan X
-> Nested Loop
-> Scan Y
-> Scan Z
Filter: Z.Z = X.X
The eclass machinery normally expects to apply X.X = Y.Y when those
two relations are joined, but in this shape of plan tree they aren't
joined until the top node --- and, if the lower nested loop is marked
as parameterized by X, the top node will assume that the relevant eclass
condition(s) got pushed down into the lower node. On the other hand,
the scan of Z assumes that it's only responsible for constraining Z.Z
to match any one of the other eclass members. So one or another of
the required quals sometimes fell between the cracks, depending on
whether consideration of the eclass in get_joinrel_parampathinfo()
for the lower nested loop chanced to generate X.X = Y.Y or X.X = Z.Z
as the appropriate constraint there. If it generated the latter,
it'd erroneously suppose that the Z scan would take care of matters.
To fix, force X.X = Y.Y to be generated and applied at that join node
when this case occurs.
This is *extremely* hard to hit in practice, because various planner
behaviors conspire to mask the problem; starting with the fact that the
planner doesn't really like to generate a parameterized plan of the
above shape. (It might have been impossible to hit it before we
tweaked things to allow this plan shape for star-schema cases.) Many
thanks to Alexander Kirkouski for submitting a reproducible test case.
The bug can be demonstrated in all branches back to 9.2 where parameterized
paths were introduced, so back-patch that far.
Kevin Grittner [Fri, 29 Apr 2016 21:46:08 +0000 (16:46 -0500)]
Add a few entries to the tail of time mapping, to see old values.
Without a few entries beyond old_snapshot_threshold, the lookup
would often fail, resulting in the more aggressive pruning or
vacuum being skipped often enough to matter. This was very clearly
shown by a python test script posted by Ants Aasma, and was likely
a factor in an earlier but somewhat less clear-cut test case posted
by Jeff Janes.
This patch makes no change to the logic, per se -- it just makes
the array of mapping entries big enough to make lookup misses based
on timing much less likely. An occasional miss is still possible
if a thread stalls for more than 10 minutes, but that does not
create any problem with correctness of behavior. Besides, if
things are so busy that a thread is stalling for more than 10
minutes, it is probably OK to skip the more aggressive cleanup at
that particular point in time.
Andrew Dunstan [Fri, 29 Apr 2016 11:59:47 +0000 (07:59 -0400)]
Support building with Visual Studio 2015
Adjust the way we detect the locale. As a result the minumum Windows
version supported by VS2015 and later is Windows Vista. Add some tweaks
to remove new compiler warnings. Remove documentation references to the
now obsolete msysGit.
Michael Paquier, somewhat edited by me, reviewed by Christian Ullrich.
Andres Freund [Fri, 29 Apr 2016 05:05:37 +0000 (22:05 -0700)]
Remember asking for feedback during walsender shutdown.
Since 5a991ef8 we're explicitly asking for feedback from the receiving
side when shutting down walsender, if there's not yet replicated
data.
Unfortunately we didn't remember (i.e. set waiting_for_ping_response to
true) having asked for feedback, leading to scenarios in which replies
were requested at a high frequency.
I can't reproduce this problem on my laptop, I think that's because the
problem requires a significant TCP window to manifest due to the
!pq_is_send_pending() condition. But since this clearly is a bug, let's
fix it. There's quite possibly more wrong than just this though.
While fiddling with WalSndDone(), I rewrote a hard to understand comment
about looking at the flush vs. the write position.
Reported-By: Nick Cleaton, Magnus Hagander
Author: Nick Cleaton
Discussion: CAFgz3kus=rC_avEgBV=+hRK5HYJ8vXskJRh8yEAbahJGTzF2VQ@mail.gmail.com
CABUevExsjROqDcD0A2rnJ6HK6FuKGyewJr3PL12pw85BHFGS2Q@mail.gmail.com
Backpatch: 9.4, were 5a991ef8 introduced the use of feedback messages
during shutdown.
Tom Lane [Thu, 28 Apr 2016 15:50:58 +0000 (11:50 -0400)]
Adjust DatumGetBool macro, this time for sure.
Commit 23a41573c attempted to fix the DatumGetBool macro to ignore bits
in a Datum that are to the left of the actual bool value. But it did that
by casting the Datum to bool; and on compilers that use C99 semantics for
bool, that ends up being a whole-word test, not a 1-byte test. This seems
to be the true explanation for contrib/seg failing in VS2015. To fix, use
GET_1_BYTE() explicitly. I think in the previous patch, I'd had some idea
of not having to commit to bool being exactly 1 byte wide, but regardless
of what the compiler's bool is, boolean columns and Datums are certainly
1 byte wide.
The previous fix was (eventually) back-patched into all active versions,
so do likewise with this one.
Tom Lane [Thu, 28 Apr 2016 15:46:07 +0000 (11:46 -0400)]
Revert "Convert contrib/seg's bool-returning SQL functions to V1 call convention."
This reverts commit c8e81afc60093b199a128ccdfbb692ced8e0c9cd.
That turns out to have been based on a faulty diagnosis of why the
VS2015 build was misbehaving. Instead, we need to fix DatumGetBool().
Prevent multiple cleanup process for pending list in GIN.
Previously, ginInsertCleanup could exit early if it detects that someone else
is cleaning up the pending list, without waiting for that someone else to
finish the job. But in this case vacuum could miss tuples to be deleted.
Cleanup process now locks metapage with a help of heavyweight
LockPage(ExclusiveLock), and it guarantees that there is no another cleanup
process at the same time. Lock is taken differently depending on caller of
cleanup process: any vacuums and gin_clean_pending_list() will be blocked
until lock becomes available, ordinary insert uses conditional lock to
prevent indefinite waiting on lock.
Insert into pending list doesn't use this lock, so insertion isn't blocked.
Also, patch adds stopping of cleanup process when at-start-cleanup-tail is
reached in order to prevent infinite cleanup in case of massive insertion. But
it will stop only for automatic maintenance tasks like autovacuum.
Patch introduces choice of limit of memory to use: autovacuum_work_mem,
maintenance_work_mem or work_mem depending on call path.
Patch for previous releases should be reworked due to changes between 9.6 and
previous ones in this area.
Discover and diagnostics by Jeff Janes and Tomas Vondra
Tom Lane [Wed, 27 Apr 2016 22:19:28 +0000 (18:19 -0400)]
Use memmove() not memcpy() to slide some pointers down.
The previous coding here was formally undefined, though it seems to
accidentally work on most platforms in the buildfarm. Caught by some
OpenBSD platforms in which libc contains an assertion check for
overlapping areas passed to memcpy().
Tom Lane [Wed, 27 Apr 2016 21:55:19 +0000 (17:55 -0400)]
Clean up parsing of synchronous_standby_names GUC variable.
Commit 989be0810dffd08b added a flex/bison lexer/parser to interpret
synchronous_standby_names. It was done in a pretty crufty way, though,
making assorted end-use sites responsible for calling the parser at the
right times. That was not only vulnerable to errors of omission, but made
it possible that lexer/parser errors occur at very undesirable times,
and created memory leakages even if there was no error.
Instead, perform the parsing once during check_synchronous_standby_names
and let guc.c manage the resulting data. To do that, we have to flatten
the parsed representation into a single hunk of malloc'd memory, but that
is not very hard.
While at it, work a little harder on making useful error reports for
parsing problems; the previous code felt that "synchronous_standby_names
parser returned 1" was an appropriate user-facing error message. (To
be fair, it did also log a syntax error message, but separately from the
GUC problem report, which is at best confusing.) It had some outright
bugs in the face of invalid input, too.
I (tgl) also concluded that we need to restrict unquoted names in
synchronous_standby_names to be just SQL identifiers. The previous coding
would accept darn near anything, which (1) makes the quoting convention
both nearly-unnecessary and formally ambiguous, (2) makes it very hard to
understand what is a syntax error and what is a creative interpretation of
the input as a standby name, and (3) makes it impossible to further extend
the syntax in future without a compatibility break. I presume that we're
intending future extensions of the syntax, else this parsing infrastructure
is massive overkill, so (3) is an important objection. Since we've taken
a compatibility hit for non-identifier names with this change anyway, we
might as well lock things down now and insist that users use double quotes
for standby names that aren't identifiers.
Robert Haas [Wed, 27 Apr 2016 15:47:28 +0000 (11:47 -0400)]
Update typedefs.list file in preparation for pgindent run
In addition to adding new typedefs, I also re-sorted the file so that
various entries add piecemeal, mostly or entirely by me, were alphabetized
the same way as other entries in the file.
Robert Haas [Wed, 27 Apr 2016 15:29:45 +0000 (11:29 -0400)]
Clean up a few parallelism-related things that pgindent wants to mangle.
In nodeFuncs.c, pgindent wants to introduce spurious indentation into
the definitions of planstate_tree_walker and planstate_walk_subplans.
Fix that by spreading the definition out across several lines, similar
to what is already done for other walker functions in that file.
In execParallel.c, in the definition of SharedExecutorInstrumentation,
pgindent wants to insert more whitespace between the type name and the
member name. That causes it to mangle comments later on the line. Fix
by moving the comments out of line. Now that we have a bit more room,
add some more details that may be useful to the next person reading
this code.