Tom Lane [Mon, 23 Sep 2019 14:32:02 +0000 (10:32 -0400)]
Doc: clean up some issues with spellings of contributor names.
In the v12 contributors list, remove a couple of duplicates
that had crept in due to variant spellings of a person's name.
Try to standardize Japanese names as given-name-first.
Tom Lane [Sun, 22 Sep 2019 21:46:00 +0000 (17:46 -0400)]
Fix failure to zero-pad the result of bitshiftright().
If the bitstring length is not a multiple of 8, we'd shift the
rightmost bits into the pad space, which must be zeroes --- bit_cmp,
for one, depends on that. This'd lead to the result failing to
compare equal to what it should compare equal to, as reported in
bug #16013 from Daryl Waycott.
This is, if memory serves, not the first such bug in the bitstring
functions. In hopes of making it the last one, do a bit more work
than minimally necessary to fix the bug:
* Add assertion checks to bit_out() and varbit_out() to complain if
they are given incorrectly-padded input. This will improve the
odds that manual testing of any new patch finds problems.
* Encapsulate the padding-related logic in macros to make it
easier to use.
Also, remove unnecessary padding logic from bit_or() and bitxor().
Somebody had already noted that we need not re-pad the result of
bit_and() since the inputs are required to be the same length,
but failed to extrapolate that to the other two.
Also, move a comment block that once was near the head of varbit.c
(but people kept putting other stuff in front of it), to put it in
the header block.
Note for the release notes: if anyone has inconsistent data as a
result of saving the output of bitshiftright() in a table, it's
possible to fix it with something like
UPDATE mytab SET bitcol = ~(~bitcol) WHERE bitcol != ~(~bitcol);
This has been broken since day one, so back-patch to all supported
branches.
Tom Lane [Sun, 22 Sep 2019 18:21:07 +0000 (14:21 -0400)]
Fix typo in tts_virtual_copyslot.
The code used the destination slot's natts where it intended to
use the source slot's natts. Adding an Assert shows that there
is no case in "make check-world" where these counts are different,
so maybe this is a harmless bug, but it's still a bug.
Tom Lane [Sat, 21 Sep 2019 20:56:30 +0000 (16:56 -0400)]
Straighten out leakproofness markings on text comparison functions.
Since we introduced the idea of leakproof functions, texteq and textne
were marked leakproof but their sibling text comparison functions were
not. This inconsistency seemed justified because texteq/textne just
relied on memcmp() and so could easily be seen to be leakproof, while
the other comparison functions are far more complex and indeed can
throw input-dependent errors.
However, that argument crashed and burned with the addition of
nondeterministic collations, because now texteq/textne may invoke
the exact same varstr_cmp() infrastructure as the rest. It makes no
sense whatever to give them different leakproofness markings.
After a certain amount of angst we've concluded that it's all right
to consider varstr_cmp() to be leakproof, mostly because the other
choice would be disastrous for performance of many queries where
leakproofness matters. The input-dependent errors should only be
reachable for corrupt input data, or so we hope anyway; certainly,
if they are reachable in practice, we've got problems with requirements
as basic as maintaining a btree index on a text column.
Hence, run around to all the SQL functions that derive from varstr_cmp()
and mark them leakproof. This should result in a useful gain in
flexibility/performance for queries in which non-leakproofness degrades
the efficiency of the query plan.
Back-patch to v12 where nondeterministic collations were added.
While this isn't an essential bug fix given the determination
that varstr_cmp() is leakproof, we might as well apply it now that
we've been forced into a post-beta4 catversion bump.
Tom Lane [Sat, 21 Sep 2019 20:29:17 +0000 (16:29 -0400)]
Fix up handling of nondeterministic collations with pattern_ops opclasses.
text_pattern_ops and its siblings can't be used with nondeterministic
collations, because they use the text_eq operator which will not behave
as bitwise equality if applied with a nondeterministic collation. The
initial implementation of that restriction was to insert a run-time test
in the related comparison functions, but that is inefficient, may throw
misleading errors, and will throw errors in some cases that would work.
It seems sufficient to just prevent the combination during CREATE INDEX,
so do that instead.
Lacking any better way to identify the opclasses involved, we need to
hard-wire tests for them, which requires hand-assigned values for their
OIDs, which forces a catversion bump because they previously had OIDs
that would be assigned automatically. That's slightly annoying in the
v12 branch, but fortunately we're not at rc1 yet, so just do it.
Back-patch to v12 where nondeterministic collations were added.
In passing, run make reformat-dat-files, which found some unrelated
whitespace issues (slightly different ones in HEAD and v12).
Tom Lane [Sat, 21 Sep 2019 19:23:53 +0000 (15:23 -0400)]
Doc: updates for v12 release notes.
Remove mention of ECPG's DECLARE STATEMENT, since that was reverted
yesterday. Rewrite some other entries per suggestions from Peter
Eisentraut. Make a couple of desultory wording and markup adjustments.
Tom Lane [Fri, 20 Sep 2019 23:53:33 +0000 (19:53 -0400)]
Update time zone data files to tzdata release 2019c.
DST law changes in Fiji and Norfolk Island. Historical corrections
for Alberta, Austria, Belgium, British Columbia, Cambodia, Hong Kong,
Indiana (Perry County), Kaliningrad, Kentucky, Michigan, Norfolk
Island, South Korea, and Turkey.
Tom Lane [Fri, 20 Sep 2019 18:22:58 +0000 (14:22 -0400)]
Fix some minor spec-compliance issues in jsonpath lexer.
Although the SQL/JSON tech report makes reference to ECMAScript which
allows both single- and double-quoted strings, all the rest of the
report speaks only of double-quoted string literals in jsonpaths.
That's more compatible with JSON itself; moreover single-quoted strings
are hard to use inside a jsonpath that is itself a single-quoted SQL
literal. So guess that the intent is to allow only double-quoted
literals, and remove lexer support for single-quoted literals.
It'll be less painful to add this again later if we're wrong, than to
remove a shipped feature.
Also, adjust the lexer so that unrecognized backslash sequences are
treated as just meaning the escaped character, not as errors. This
change has much better support in the standards, as JSON, JavaScript
and ECMAScript all make it plain that that's what's supposed to
happen.
Tom Lane [Fri, 20 Sep 2019 16:47:21 +0000 (12:47 -0400)]
Revert "Add DECLARE STATEMENT support to ECPG."
This reverts commit bd7c95f0c1a38becffceb3ea7234d57167f6d4bf,
along with assorted follow-on fixes. There are some questions
about the definition and implementation of that statement, and
we don't have time to resolve them before v13 release. Rather
than ship the feature and then have backwards-compatibility
concerns constraining any redesign, let's remove it for now
and try again later.
Provide stable test for NULL-values in KNN SP-GiST
f5f084fc3e has removed test because of its instability. This commit provides
alternative test with determined ordering using extra ORDER BY expression.
6cae9d2c10 introduced test for NULL values in KNN SP-GiST. This test relies on
undetermined ordering showing different results on various platforms. This
commit removes that test. Will be replaced with better test later.
Fix freeing old values in index_store_float8_orderby_distances()
6cae9d2c10 has added an error in freeing old values in
index_store_float8_orderby_distances() function. It looks for old value in
scan->xs_orderbynulls[i] after setting a new value there.
This commit fixes that. Also it removes short-circuit in handling
distances == NULL situation. Now distances == NULL will be treated the same
way as array with all null distances. That is, previous values will be freed
if any.
Improve handling of NULLs in KNN-GiST and KNN-SP-GiST
This commit improves subject in two ways:
* It removes ugliness of 02f90879e7, which stores distance values and null
flags in two separate arrays after GISTSearchItem struct. Instead we pack
both distance value and null flag in IndexOrderByDistance struct. Alignment
overhead should be negligible, because we typically deal with at most few
"col op const" expressions in ORDER BY clause.
* It fixes handling of "col op NULL" expression in KNN-SP-GiST. Now, these
expression are not passed to support functions, which can't deal with them.
Instead, NULL result is implicitly assumed. It future we may decide to
teach support functions to deal with NULL arguments, but current solution is
bugfix suitable for backpatch.
Tom Lane [Thu, 19 Sep 2019 15:22:21 +0000 (11:22 -0400)]
Doc: improve documentation around jsonpath regular expressions.
Provide some documentation about the differences between XQuery
regular expressions and those supported by Spencer's regex engine.
Since SQL now exposes XQuery regexps with the LIKE_REGEX operator,
I made this a standalone section designed to help somebody who
has to translate a LIKE_REGEX query to Postgres. (Eventually we might
extend Spencer's engine to allow precise implementation of XQuery,
but not today.)
Reference that in the jsonpath docs, provide definitions of the
XQuery flag letters, and add a description of the JavaScript-inspired
string literal syntax used within jsonpath. Also point out explicitly
that backslashes used within like_regex patterns will need to be doubled.
This also syncs the docs with the decision implemented in commit d5b90cd64 to desupport XQuery's 'x' flag for now.
Peter Eisentraut [Thu, 19 Sep 2019 13:03:23 +0000 (15:03 +0200)]
GSSAPI error message improvements
Make the error messages around GSSAPI encryption a bit clearer. Tweak
some messages to avoid plural problems.
Also make a code change for clarity. Using "conf" for "confidential"
is quite confusing. Using "conf_state" is perhaps not much better but
that's what the GSSAPI documentation uses, so there is at least some
hope of understanding it.
Reported-by: Filip Rembiałkowski
Author: Filip Rembiałkowski
Backpatch-through: 12, where it was introduced
Discussion: https://postgr.es/m/CAP_rwwmSNy1=_82rwGe3-X4PjWqPSFXtzNf43DCtGzD7SazdXA@mail.gmail.com
Peter Eisentraut [Thu, 19 Sep 2019 07:02:41 +0000 (09:02 +0200)]
Revert change of ecpglib major version
The major version of ecpglib was changed in bd7c95f0c1a38becffceb3ea7234d57167f6d4bf, apparently without
justification. Revert this, since nothing has changed in this library
except some added functions.
Michael Paquier [Thu, 19 Sep 2019 04:19:00 +0000 (13:19 +0900)]
Doc: Fix incorrect mention to connection_object in CONNECT command of ECPG
This fixes an inconsistency with this parameter name not listed in the
command synopsis, and connection_name is the parameter name more
commonly used in the docs for ECPG commands.
Amit Kapila [Thu, 19 Sep 2019 02:32:12 +0000 (08:02 +0530)]
Doc: document autovacuum interruption.
It's important users be able to know (without looking at the source code)
that running DDL or DDL-like commands can interrupt autovacuum which can
lead to a lot of dead tuples and hence slower database operations.
Reported-by: James Coleman
Author: James Coleman Reviewed-by: Amit Kapila
Backpatch-through: 9.4
Discussion: https://postgr.es/m/CAAaqYe-XYyNwML1=f=gnd0qWg46PnvD=BDrCZ5-L94B887XVxQ@mail.gmail.com
Doc: Update FDW documentation about direct foreign table modification.
1. Commit 7086be6e3 should have documented the limitation that the direct
modification is disabled when WCO constraints are present, but didn't,
which is definitely my fault. Update the documentation (Postgres 9.6
onwards).
2. Commit fc22b6623 should have documented the limitation that the direct
modification is disabled when generated columns are defined, but
didn't. Update the documentation (Postgres 12 onwards).
Tom Lane [Tue, 17 Sep 2019 19:39:51 +0000 (15:39 -0400)]
Fix bogus handling of XQuery regex option flags.
The SQL spec defers to XQuery to define what the option flags are
for LIKE_REGEX patterns. XQuery says that:
* 's' allows the dot character to match newlines, which by
default it will not;
* 'm' allows ^ and $ to match at newlines, not only at the
start/end of the whole string.
Thus, these are *not* inverses as they are for the similarly-named
POSIX options, and neither one corresponds to the POSIX 'n' option.
Fortunately, Spencer's library does expose these two behaviors as
separately twiddlable flags, so we just have to fix the mapping from
JSP flag bits to REG flag bits. I also chose to rename the symbol
for 's' to DOTALL, to make it clearer that it's not the inverse
of MLINE.
Also, XQuery says that if the 'q' flag "is used together with the m, s,
or x flag, that flag has no effect". I read this as saying that 'q'
overrides the other flags; whoever wrote our code seems to have read
it backwards.
Lastly, while XQuery's 'x' flag is related to what Spencer's code
does for REG_EXPANDED, it's not the same or a subset. It seems best
to treat XQuery's 'x' as unimplemented for now. Maybe later we can
expand our regex code to offer 'x'-style parsing as a separate option.
While at it, refactor the jsonpath code so that (a) there's only
one copy of the flag transformation logic not two, and (b) the
processing of flags is independent of the order in which the flags
are written.
We need some documentation updates to go with this, but I'll
tackle that separately.
Dean Rasheed [Sun, 15 Sep 2019 12:16:59 +0000 (13:16 +0100)]
Fix intermittent self-test failures caused by the stats_ext test.
Commit d7f8d26d9 added new tests to the stats_ext regression test that
included creating a view in the public schema, without realising that
the stats_ext test runs in the same parallel group as the rules test,
which makes doing that unsafe.
This led to intermittent failures of the rules test on the buildfarm,
although I wasn't able to reproduce that locally. Fix by creating the
view in a different schema.
Tomas Vondra and Dean Rasheed, report and diagnosis by Thomas Munro.
PostgreSQL has been unusable when built with xlc 13 and newer, which are
incompatible with our use of __fetch_and_add(). Back-patch to 9.5,
which introduced pg_atomic_fetch_add_u32().
logical decoding: process ASSIGNMENT during snapshot build
Most WAL records are ignored in early SnapBuild snapshot build phases.
But it's critical to process some of them, so that later messages have
the correct transaction state after the snapshot is completely built; in
particular, XLOG_XACT_ASSIGNMENT messages are critical in order for
sub-transactions to be correctly assigned to their parent transactions,
or at least one assert misbehaves, as reported by Ildar Musin.
Lack of parens in the definitions could cause a statement using these
macros to have unexpected semantics. In current code no bug is
apparent, but best to fix the definitions to avoid problems down the
line.
Reported-by: Tom Lane
Discussion: https://postgr.es/m/19795.1568400476@sss.pgh.pa.us
The progress state was being clobbered once the first index completed
being rebuilt, causing the final phases of the operation not show
anything in the progress view. This was inadvertently broken in 03f9e5cba0ee, which added progress tracking for REINDEX.
(The reason this bugfix is this small is that I had already noticed this
problem when writing monitoring for CREATE INDEX, and had already worked
around it, as can be seen in discussion starting at
https://postgr.es/m/20190329150218.GA25010@alvherre.pgsql Fixing the
problem is just a matter of fixing one place touched by the REINDEX
monitoring.)
Peter Geoghegan [Thu, 12 Sep 2019 22:45:07 +0000 (15:45 -0700)]
Fix nbtree page split rmgr desc routine.
Include newitemoff in rmgr desc output for nbtree page split records.
In passing, correct an obsolete comment that claimed that newitemoff is
only logged for _L variant nbtree page split WAL records.
Both issues were oversights in commit 2c03216d831, which revamped the
WAL format.
Author: Peter Geoghegan
Backpatch: 9.5-, where the WAL format was revamped.
Tom Lane [Thu, 12 Sep 2019 22:29:17 +0000 (18:29 -0400)]
Fix usage of whole-row variables in WCO and RLS policy expressions.
Since WITH CHECK OPTION was introduced, ExecInitModifyTable has
initialized WCO expressions with the wrong plan node as parent -- that is,
it passed its input subplan not the ModifyTable node itself. Up to now
we thought this was harmless, but bug #16006 from Vinay Banakar shows it's
not: if the input node is a SubqueryScan then ExecInitWholeRowVar can get
confused into doing the wrong thing. (The fact that ExecInitWholeRowVar
contains such logic is certainly a horrid kluge that doesn't deserve to
live, but figuring out another way to do that is a task for some other day.)
Andres had already noticed the wrong-parent mistake and fixed it in commit 148e632c0, but not being aware of any user-visible consequences, he quite
reasonably didn't back-patch. This patch is simply a back-patch of 148e632c0, plus addition of a test case based on bug #16006. I also added
the test case to v12/HEAD, even though the bug is already fixed there.
Back-patch to all supported branches. 9.4 lacks RLS policies so the
new test case doesn't work there, but I'm pretty sure a test could be
devised based on using a whole-row Var in a plain WITH CHECK OPTION
condition. (I lack the cycles to do so myself, though.)
Amit Kapila [Wed, 11 Sep 2019 04:55:49 +0000 (10:25 +0530)]
Doc: Update PL/pgSQL sample function in plpgsql.sgml.
The example used to explain 'Looping Through Query Results' uses
pseudo-materialized views. Replace it with a more up-to-date example
which does the same thing with actual materialized views, which have
been available since PostgreSQL 9.3.
In the passing, change '%' as format specifier instead of '%s' as is used
in other examples in plpgsql.sgml.
Reported-by: Ian Barwick
Author: Ian Barwick Reviewed-by: Amit Kapila
Backpatch-through: 9.4
Discussion: https://postgr.es/m/9a70d393-7904-4918-c97c-649f6d114b6a@2ndquadrant.com
Michael Paquier [Wed, 11 Sep 2019 02:07:25 +0000 (11:07 +0900)]
Expand properly list of TAP tests used for prove in vcregress.pl
Depending on the system used, t/*.pl may not be expanded into a list of
tests which can be consumed by prove when attempting to run TAP tests on
a given path. Fix that by using glob() directly in the script, to make
sure that a complete list of tests is provided. This has not proved to
be an issue with MSVC as the list was properly expanded, but it is on
Linux with perl's system().
This is extracted from a larger patch.
Author: Tom Lane
Discussion: https://postgr.es/m/6628.1567958876@sss.pgh.pa.us
Backpatch-through: 9.4
Tom Lane [Tue, 10 Sep 2019 16:04:57 +0000 (12:04 -0400)]
Don't drop NOTICE messages in isolation tests.
For its entire existence, isolationtester.c has forced client_min_messages
to WARNING, but that seems like a very poor choice of test design. It
should be up to individual test scripts to manage whether they emit notices
and to ensure that the results are stable. (There were no NOTICE messages
in the original set of isolation tests, so this was certainly dead code
when committed, but perhaps it was needed at some earlier point.)
It's possible that the original motivation was due to platform-dependent
variations in the timing of stdout vs. stderr output. That should be
moot since commits 73bcb76b7/6eda3e9c2, but just in case, adjust
isotesterNoticeProcessor to print to stdout not stderr. (stderr seems
like the wrong thing anyway: it should be for error printouts not expected
test output.)
Back-patch of commit ebd499282 into v12. I'll separately push this
into older branches, but this is as much change as v12 needs.
Tom Lane [Tue, 10 Sep 2019 02:53:51 +0000 (22:53 -0400)]
Fix isolationtester race condition for notices sent before blocking.
If a test sends a notice just before blocking, it's possible on
slow machines for isolationtester to detect the blocked state before
it's consumed the notice. (For this to happen, the notice would have
to arrive after isolationtester has waited for data for 10ms, so on
fast/lightly-loaded machines it's hard to reproduce the failure.)
But, if we have seen the backend as blocked, it's certainly already
sent any notices it's going to send. Therefore, one more round of
PQconsumeInput and PQisBusy should be enough to collect and process
any such notices.
Back-patch of 30717637c into v12. We're still discussing whether
to back-patch this further and/or back-patch some other recent
isolationtester fixes, but this much is provably necessary to
make the test cases added by 27cc7cd2b stable in v12.
Tom Lane [Mon, 9 Sep 2019 18:21:40 +0000 (14:21 -0400)]
Be more careful about port selection in src/test/ldap/.
Don't just assume that the next port is free; it might not be, or
if we're really unlucky it might even be out of the TCP range.
Do it honestly with two get_free_port() calls instead.
This is surely a pretty low-probability problem, but I think it
explains a buildfarm failure seen today, so let's fix it.
Andrew Dunstan [Mon, 9 Sep 2019 12:56:33 +0000 (08:56 -0400)]
Prevent msys2 conversion of "cmd /c" switch to a file path
Modern versions of msys2 have changed the treatment of "cmd /c" so that
the runtime will try to convert the switch to a native file path. This
patch adds a setting to inhibit that behaviour.
Andres Freund [Mon, 9 Sep 2019 12:21:30 +0000 (05:21 -0700)]
Reorder EPQ work, to fix rowmark related bugs and improve efficiency.
In ad0bda5d24ea I changed the EvalPlanQual machinery to store
substitution tuples in slot, instead of using plain HeapTuples. The
main motivation for that was that using HeapTuples will be inefficient
for future tableams. But it turns out that that conversion was buggy
for non-locking rowmarks - the wrong tuple descriptor was used to
create the slot.
As a secondary issue 5db6df0c0 changed ExecLockRows() to begin EPQ
earlier, to allow to fetch the locked rows directly into the EPQ
slots, instead of having to copy tuples around. Unfortunately, as Tom
complained, that forces some expensive initialization to happen
earlier.
As a third issue, the test coverage for EPQ was clearly insufficient.
Fixing the first issue is unfortunately not trivial: Non-locked row
marks were fetched at the start of EPQ, and we don't have the type
information for the rowmarks available at that point. While we could
change that, it's not easy. It might be worthwhile to change that at
some point, but to fix this bug, it seems better to delay fetching
non-locking rowmarks when they're actually needed, rather than
eagerly. They're referenced at most once, and in cases where EPQ
fails, might never be referenced. Fetching them when needed also
increases locality a bit.
To be able to fetch rowmarks during execution, rather than
initialization, we need to be able to access the active EPQState, as
that contains necessary data. To do so move EPQ related data from
EState to EPQState, and, only for EStates creates as part of EPQ,
reference the associated EPQState from EState.
To fix the second issue, change EPQ initialization to allow use of
EvalPlanQualSlot() to be used before EvalPlanQualBegin() (but
obviously still requiring EvalPlanQualInit() to have been done).
As these changes made struct EState harder to understand, e.g. by
adding multiple EStates, significantly reorder the members, and add a
lot more comments.
Also add a few more EPQ tests, including one that fails for the first
issue above. More is needed.
Reported-By: yi huang
Author: Andres Freund Reviewed-By: Tom Lane
Discussion:
https://postgr.es/m/CAHU7rYZo_C4ULsAx_LAj8az9zqgrD8WDd4hTegDTMM1LMqrBsg@mail.gmail.com
https://postgr.es/m/24530.1562686693@sss.pgh.pa.us
Backpatch: 12-, where the EPQ changes were introduced
Fix handling of non-key columns get_index_column_opclass()
f2e40380 introduces support of non-key attributes in GiST indexes. Then if
get_index_column_opclass() is asked by gistproperty() to get an opclass of
non-key column, it returns garbage past oidvector value. This commit fixes
that by making get_index_column_opclass() return InvalidOid in this case.
Discussion: https://postgr.es/m/20190902231948.GA5343%40alvherre.pgsql
Author: Nikita Glukhov, Alexander Korotkov
Backpatch-through: 12
Tom Lane [Sun, 8 Sep 2019 21:00:29 +0000 (17:00 -0400)]
Fix RelationIdGetRelation calls that weren't bothering with error checks.
Some of these are quite old, but that doesn't make them not bugs.
We'd rather report a failure via elog than SIGSEGV.
While at it, uniformly spell the error check as !RelationIsValid(rel)
rather than a bare rel == NULL test. The machine code is the same
but it seems better to be consistent.
Coverity complained about this today, not sure why, because the
mistake is in fact old.
In order to implement NULL LAST semantic GiST previously assumed distance to
the NULL value to be Inf. However, our distance functions can return Inf and
NaN for non-null values. In such cases, NULL LAST semantic appears to be
broken. This commit fixes that by introducing separate array of null flags for
distances.
Backpatch to all supported versions.
Discussion: https://postgr.es/m/CAPpHfdsNvNdA0DBS%2BwMpFrgwT6C3-q50sFVGLSiuWnV3FqOJuQ%40mail.gmail.com
Author: Alexander Korotkov
Backpatch-through: 9.4
Fix handling Inf and Nan values in GiST pairing heap comparator
Previously plain float comparison was used in GiST pairing heap. Such
comparison doesn't provide proper ordering for value sets containing Inf and Nan
values. This commit fixes that by usage of float8_cmp_internal(). Note, there
is remaining problem with NULL distances, which are represented as Inf in
pairing heap. It would be fixes in subsequent commit.
Backpatch to all supported versions.
Reported-by: Andrey Borodin
Discussion: https://postgr.es/m/CAPpHfdsNvNdA0DBS%2BwMpFrgwT6C3-q50sFVGLSiuWnV3FqOJuQ%40mail.gmail.com
Author: Alexander Korotkov Reviewed-by: Heikki Linnakangas
Backpatch-through: 9.4
Fix behavior of AND CHAIN outside of explicit transaction blocks
When using COMMIT AND CHAIN or ROLLBACK AND CHAIN not in an explicit
transaction block, the previous implementation would leave a
transaction block active in the ROLLBACK case but not the COMMIT case.
To fix for now, error out when using these commands not in an explicit
transaction block. This restriction could be lifted if a sensible
definition and implementation is found.
Tom Lane [Sat, 7 Sep 2019 23:03:11 +0000 (19:03 -0400)]
Avoid using INFO elevel for what are fundamentally debug messages.
Commit 6f6b99d13 stuck an INFO message into the fast path for
checking partition constraints, for no very good reason except
that it made it easy for the regression tests to verify that
that path was taken. Assorted later patches did likewise,
increasing the unsuppressable-chatter level from ALTER TABLE
even more. This isn't good for the user experience, so let's
drop these messages down to DEBUG1 where they belong. So as
not to have a loss of test coverage, create a TAP test that
runs the relevant queries with client_min_messages = DEBUG1
and greps for the expected messages.
This testing method is a bit brute-force --- in particular,
it duplicates the execution of a fair amount of the core
create_table and alter_table tests. We experimented with
other solutions, but running any significant amount of
standard testing with client_min_messages = DEBUG1 seems
to have a lot of output-stability pitfalls, cf commits bbb96c370 and 5655565c0. Possibly at some point we'll look
into whether we can reduce the amount of test duplication.
Backpatch into v12, because some of these messages are new
in v12 and we don't really want to ship it that way.
Tom Lane [Sat, 7 Sep 2019 15:21:24 +0000 (11:21 -0400)]
Doc: make an editing pass over the v12 release notes.
Copy-edit or rewrite some items, add some items that I thought
deserved documenting, remove some others that didn't (notably,
I'm not on board with documenting documentation changes), fix
some poorly-pointed links, move some items to other sections,
etc etc.
Robert Haas [Fri, 6 Sep 2019 12:22:32 +0000 (08:22 -0400)]
When performing a base backup, check for read errors.
The old code didn't differentiate between a read error and a
concurrent truncation. fread reports both of these by returning 0;
you have to use feof() or ferror() to distinguish between them,
which this code did not do.
It might be a better idea to use read() rather than fread() here,
so that we can display a less-generic error message, but I'm not
sure that would qualify as a back-patchable bug fix, so just do
this much for now.
Jeevan Chalke, reviewed by Jeevan Ladhe and by me.
Make pg_promote() detect postmaster death while waiting for promotion to end.
Previously even if postmaster died and WaitLatch() woke up with that event
while pg_promote() was waiting for the standby promotion to finish,
pg_promote() did nothing special and kept waiting until timeout occurred.
This could cause a busy loop.
This patch make pg_promote() return false immediately when postmaster
dies, to avoid such a busy loop.
Back-patch to v12 where pg_promote() was added.
Author: Fujii Masao Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/CAHGQGwEs9ROgSp+QF+YdDU+xP8W=CY1k-_Ov-d_Z3JY+to3eXA@mail.gmail.com
Michael Paquier [Wed, 4 Sep 2019 06:46:45 +0000 (15:46 +0900)]
Fix thinko when ending progress report for a backend
The logic ending progress reporting for a backend entry introduced by b6fb647 causes callers of pgstat_progress_end_command() to do some extra
work when track_activities is enabled as the process fields are reset in
the backend entry even if no command were started for reporting.
This resets the fields only if a command is registered for progress
reporting, and only if track_activities is enabled.
Michael Paquier [Wed, 4 Sep 2019 04:23:41 +0000 (13:23 +0900)]
Delay fsyncs of pg_basebackup until the end of backup
Since the addition of fsync requests in bc34223 to make base backup data
consistent on disk once pg_basebackup finishes, each tablespace tar file
is individually flushed once completed, with an additional flush of the
parent directory when the base backup finishes. While holding a
connection to the server, a fsync request taking a long time may cause a
failure of the base backup, which is annoying for any integration. A
recent example of breakage can involve tcp_user_timeout, but
wal_sender_timeout can cause similar problems.
While reviewing the code, there was a second issue causing too many
fsync requests to be done for the same WAL data. As recursive fsyncs
are done at the end of the backup for both the plain and tar formats
from the base target directory where everything is written, it is fine
to disable fsyncs when fetching or streaming WAL.
Clarify in the help output and documentation that -n, -t etc. take a
"pattern" rather than a "schema" or "table" etc. This was especially
confusing now that the new pg_dumpall --exclude-database option was
documented with "pattern" and the others not, even though they all
behave the same.
Michael Paquier [Tue, 3 Sep 2019 03:31:03 +0000 (12:31 +0900)]
Fix memory leak with lower, upper and initcap with ICU-provided collations
The leak happens in str_tolower, str_toupper and str_initcap, which are
used in several places including their equivalent SQL-level functions,
and can only be triggered when using an ICU-provided collation when
converting the input string.
b615920 fixed a similar leak. Backpatch down 10 where ICU collations
have been introduced.
Author: Konstantin Knizhnik
Discussion: https://postgr.es/m/94c0ad0a-cbc2-e4a3-7829-2bdeaf9146db@postgrespro.ru
Backpatch-through: 10
Tom Lane [Mon, 2 Sep 2019 20:10:37 +0000 (16:10 -0400)]
Avoid touching replica identity index in ExtractReplicaIdentity().
In what seems like a fit of misplaced optimization,
ExtractReplicaIdentity() accessed the relation's replica-identity
index without taking any lock on it. Usually, the surrounding query
already holds some lock so this is safe enough ... but in the case
of a previously-planned delete, there might be no existing lock.
Given a suitable test case, this is exposed in v12 and HEAD by an
assertion added by commit b04aeb0a0.
The whole thing's rather poorly thought out anyway; rather than
looking directly at the index, we should use the index-attributes
bitmap that's held by the parent table's relcache entry, as the
caller functions do. This is more consistent and likely a bit
faster, since it avoids a cache lookup. Hence, change to doing it
that way.
While at it, rather than blithely assuming that the identity
columns are non-null (with catastrophic results if that's wrong),
add assertion checks that they aren't null. Possibly those should
be actual test-and-elog, but I'll leave it like this for now.
In principle, this is a bug that's been there since this code was
introduced (in 9.4). In practice, the risk seems quite low, since
we do have a lock on the index's parent table, so concurrent
changes to the index's catalog entries seem unlikely. Given the
precedent that commit 9c703c169 wasn't back-patched, I won't risk
back-patching this further than v12.
Tom Lane [Mon, 2 Sep 2019 18:02:45 +0000 (14:02 -0400)]
Handle corner cases correctly in psql's reconnection logic.
After an unexpected connection loss and successful reconnection,
psql neglected to resynchronize its internal state about the server,
such as server version. Ordinarily we'd be reconnecting to the same
server and so this isn't really necessary, but there are scenarios
where we do need to update --- one example is where we have a list
of possible connection targets and they're not all alike.
Define "resynchronize" as including connection_warnings(), so that
this case acts the same as \connect. This seems useful; for example,
if the server version did change, the user might wish to know that.
An attuned user might also notice that the new connection isn't
SSL-encrypted, for example, though this approach isn't especially
in-your-face about such changes. Although this part is a behavioral
change, it only affects interactive sessions, so it should not break
any applications.
Also, in do_connect, make sure that we desynchronize correctly when
abandoning an old connection in non-interactive mode.
These problems evidently are the result of people patching only one
of the two places where psql deals with connection changes, so insert
some cross-referencing comments in hopes of forestalling future bugs
of the same ilk.
Lastly, in Windows builds, issue codepage mismatch warnings only at
startup, not during reconnections. psql's codepage can't change
during a reconnect, so complaining about it again seems like useless
noise.
Peter Billen and Tom Lane. Back-patch to all supported branches.
Tom Lane [Fri, 30 Aug 2019 17:02:35 +0000 (13:02 -0400)]
Doc: remove some long-obsolete information from installation.sgml.
Section 16.2 pointed to platform-specific FAQ files that we removed
way back in 8.4. Section 16.7 contained a bunch of information about
AIX and HPUX bugs that were squashed decades ago, plus discussions of
old compiler versions that are certainly moot now that we require C99
support. Since we're obviously not maintaining this stuff carefully,
just remove it. The HPUX sub-section seems like it can go away
entirely, since everything it said that was still applicable was
redundant with material elsewhere in the chapter.
In passing, I couldn't resist the temptation to do a small amount
of copy-editing on nearby text.
Back-patch to v12, since this stuff is surely obsolete in any
branch that requires C99.
Fix overflow check and comment in GIN posting list encoding.
The comment did not match what the code actually did for integers with
the 43rd bit set. You get an integer like that, if you have a posting
list with two adjacent TIDs that are more than 2^31 blocks apart.
According to the comment, we would store that in 6 bytes, with no
continuation bit on the 6th byte, but in reality, the code encodes it
using 7 bytes, with a continuation bit on the 6th byte as normal.
The decoding routine also handled these 7-byte integers correctly, except
for an overflow check that assumed that one integer needs at most 6 bytes.
Fix the overflow check, and fix the comment to match what the code
actually does. Also fix the comment that claimed that there are 17 unused
bits in the 64-bit representation of an item pointer. In reality, there
are 64-32-11=21.
Fitting any item pointer into max 6 bytes was an important property when
this was written, because in the old pre-9.4 format, item pointers were
stored as plain arrays, with 6 bytes for every item pointer. The maximum
of 6 bytes per integer in the new format guaranteed that we could convert
any page from the old format to the new format after upgrade, so that the
new format was never larger than the old format. But we hardly need to
worry about that anymore, and running into that problem during upgrade,
where an item pointer is expanded from 6 to 7 bytes such that the data
doesn't fit on a page anymore, is implausible in practice anyway.
Backpatch to all supported versions.
This also includes a little test module to test these large distances
between item pointers, without requiring a 16 TB table. It is not
backpatched, I'm including it more for the benefit of future development
of new posting list formats.
Discussion: https://www.postgresql.org/message-id/33bfc20a-5c86-f50c-f5a5-58e9925d05ff%40iki.fi Reviewed-by: Masahiko Sawada, Alexander Korotkov
Thomas Munro [Wed, 28 Aug 2019 01:37:03 +0000 (13:37 +1200)]
Avoid catalog lookups in RelationAllowsEarlyPruning().
RelationAllowsEarlyPruning() performed a catalog scan, but is used
in two contexts where that was a bad idea:
1. In heap_page_prune_opt(), which runs very frequently in some large
scans. This caused major performance problems in a field report
that was easy to reproduce.
2. In TestForOldSnapshot(), which runs while we hold a buffer content
lock. It's not clear if this was guaranteed to be free of buffer
deadlock risk.
The check was introduced in commit 2cc41acd8 and defended against a
real problem: 9.6's hash indexes have no page LSN and so we can't
allow early pruning (ie the snapshot-too-old feature). We can remove
the check from all later releases though: hash indexes are now logged,
and there is no way to create UNLOGGED indexes on regular logged
tables.
If a future release allows such a combination, it might need to put
a similar check in place, but it'll need some more thought.
Back-patch to 10.
Author: Thomas Munro Reviewed-by: Tom Lane, who spotted the second problem
Discussion: https://postgr.es/m/CA%2BhUKGKT8oTkp5jw_U4p0S-7UG9zsvtw_M47Y285bER6a2gD%2Bg%40mail.gmail.com
Discussion: https://postgr.es/m/CAA4eK1%2BWy%2BN4eE5zPm765h68LrkWc3Biu_8rzzi%2BOYX4j%2BiHRw%40mail.gmail.com
Michael Paquier [Wed, 28 Aug 2019 02:48:15 +0000 (11:48 +0900)]
Disable timeouts when running pg_rewind with online source cluster
In this case, the transfer uses a libpq connection, which is subject to
the timeout parameters set at system level, and this can make the rewind
operation suddenly canceled which is not good for automation. One
workaround to such issues would be to use PGOPTIONS to enforce the
wanted timeout parameters, but that's annoying, and for example pg_dump,
which can run potentially long-running queries disables all types of
timeouts.
lock_timeout and statement_timeout are the ones which can cause problems
now. Note that pg_rewind does not use transactions, so disabling
idle_in_transaction_session_timeout is optional, but it feels safer to
do so for the future.
This is back-patched down to 9.5. idle_in_transaction_session_timeout
is only present since 9.6.
Author: Alexander Kukushkin
Discussion: https://postgr.es/m/CAFh8B=krcVXksxiwVQh1SoY+ziJ-JC=6FcuoBL3yce_40Es5_g@mail.gmail.com
Backpatch-through: 9.5
Tom Lane [Tue, 27 Aug 2019 21:24:13 +0000 (17:24 -0400)]
Improve what pg_strsignal prints if we haven't got strsignal(3).
Turns out that returning "unrecognized signal" is confusing.
Make it explicit that the platform lacks any support for signal names.
(At least of the machines in the buildfarm, only HPUX lacks it.)
Back-patch to v12 where we invented this function.
Tom Lane [Tue, 27 Aug 2019 20:37:22 +0000 (16:37 -0400)]
Doc: clarify behavior of standard aggregates for null inputs.
Section 4.2.7 says that unless otherwise specified, built-in
aggregates ignore rows in which any input is null. This is
not true of the JSON aggregates, but it wasn't documented.
Fix that.
Of the other entries in table 9.55, some were explicit about
ignoring nulls, and some weren't; for consistency and
self-contained-ness, make them all say it explicitly.
Per bug #15884 from Tim Möhlmann. Back-patch to all supported
branches.
Tom Lane [Tue, 27 Aug 2019 18:44:26 +0000 (14:44 -0400)]
Reject empty names and recursion in config-file include directives.
An empty file name or subdirectory name leads join_path_components() to
just produce the parent directory name, which leads to weird failures or
recursive inclusions. Let's throw a specific error for that. It takes
only slightly more code to detect all-blank names, so do so.
Also, detect direct recursion, ie a file calling itself. As coded
this will also detect recursion via "include_dir '.'", which is
perhaps more likely than explicitly including the file itself.
Detecting indirect recursion would require API changes for guc-file.l
functions, which seems not worth it since extensions might call them.
The nesting depth limit will catch such cases eventually, just not
with such an on-point error message.
In passing, adjust the example usages in postgresql.conf.sample
to perhaps eliminate the problem at the source: there's no reason
for the examples to suggest that an empty value is valid.
Per a trouble report from Brent Bates. Back-patch to 9.5; the
issue is old, but the code in 9.4 is enough different that the
patch doesn't apply easily, and it doesn't seem worth the trouble
to fix there.
Michael Paquier [Tue, 27 Aug 2019 00:11:38 +0000 (09:11 +0900)]
Fix failure of --jobs with vacuumdb on Windows
FD_SETSIZE needs to be declared before winsock2.h, or it is possible to
run into buffer overflow issues when using --jobs. This is similar to
pgbench's solution done in a23c641.
This has been introduced by 71d84ef, and older versions have been using
the default value of FD_SETSIZE, defined at 64.
Per buildfarm member jacana, but this impacts all Windows animals
running the TAP tests. I have reproduced the failure locally to check
the patch.
Author: Michael Paquier Reviewed-by: Andrew Dunstan
Discussion: https://postgr.es/m/20190826054000.GE7005@paquier.xyz
Backpatch-through: 9.5
Andrew Dunstan [Mon, 26 Aug 2019 12:11:27 +0000 (08:11 -0400)]
Adjust to latest Msys2 kernel release number
Previously 'uname -r' on Msys2 reported a kernele release starting with
2. The latest version starts with 3. In commit 1638623f we specifically
looked for one starting with 2. This is now changed to look for any
digit between 2 and 9.
Andrew Dunstan [Mon, 26 Aug 2019 11:44:34 +0000 (07:44 -0400)]
Treat MINGW and MSYS the same in pg_upgrade test script
On msys2, 'uname -s' reports a string starting MSYS instead on MINGW
as happens on msys1. Treat these both the same way. This reverts 608a710195a4b in favor of a more general solution.
Michael Paquier [Mon, 26 Aug 2019 02:14:24 +0000 (11:14 +0900)]
Fix error handling of vacuumdb when running out of fds
When trying to use a high number of jobs, vacuumdb has only checked for
a maximum number of jobs used, causing confusing failures when running
out of file descriptors when the jobs open connections to Postgres.
This commit changes the error handling so as we do not check anymore for
a maximum number of allowed jobs when parsing the option value with
FD_SETSIZE, but check instead if a file descriptor is within the
supported range when opening the connections for the jobs so as this is
detected at the earliest time possible.
Also, improve the error message to give a hint about the number of jobs
recommended, using a wording given by the reviewers of the patch.
Reported-by: Andres Freund
Author: Michael Paquier Reviewed-by: Andres Freund, Álvaro Herrera, Tom Lane
Discussion: https://postgr.es/m/20190818001858.ho3ev4z57fqhs7a5@alap3.anarazel.de
Backpatch-through: 9.5
Tom Lane [Sun, 25 Aug 2019 19:04:04 +0000 (15:04 -0400)]
Avoid platform-specific null pointer dereference in psql.
POSIX permits getopt() to advance optind beyond argc when the last
argv entry is an option that requires an argument and hasn't got one.
It seems that no major platforms actually do that, but musl does,
so that something like "psql -f" would crash with that libc.
Add a check that optind is in range before trying to look at the
possibly-bogus option.
Report and fix by Quentin Rameau. Back-patch to all supported
branches.
Tom Lane [Sun, 25 Aug 2019 16:14:50 +0000 (12:14 -0400)]
Back off output precision in circle.sql regression test.
We were setting extra_float_digits = 0 to avoid platform-dependent
output in this test, but that's still able to expose platform-specific
roundoff behavior in some new test cases added by commit a3d284485,
as reported by Peter Eisentraut. Reduce it to -1 to hide that.
(Over in geometry.sql, we're using -3, which is an ancient decision
dating to 337f73b1b. I wonder whether that's overkill now. But
there's probably little value in trying to change it.)
Back-patch to v12 where a3d284485 came in; there's no evidence that
we have any platform-dependent issues here before that.
Thomas Munro [Sun, 25 Aug 2019 01:54:48 +0000 (13:54 +1200)]
Don't rely on llvm::make_unique.
Bleeding-edge LLVM has stopped supplying replacements for various
C++14 library features, for people on older C++ versions. Since we're
not ready to require C++14 yet, just use plain old new instead of
make_unique. As revealed by buildfarm animal seawasp.
Back-patch to 11.
Reviewed-by: Andres Freund
Discussion: https://postgr.es/m/CA%2BhUKGJWG7unNqmkxg7nC5o3o-0p2XP6co4r%3D9epqYMm8UY4Mw%40mail.gmail.com
Michael Paquier [Fri, 23 Aug 2019 11:41:13 +0000 (20:41 +0900)]
Improve documentation of pageinspect
This adds a section for heap-related functions. These were previously
mixed with functions having a more general purpose, leading to
confusion. While on it, add a query example for fsm_page_contents.
Backpatch down to 10, where b5e3942 introduced the subsections for
function types in pageinspect documentation.