Michael Paquier [Tue, 3 Sep 2019 03:31:03 +0000 (12:31 +0900)]
Fix memory leak with lower, upper and initcap with ICU-provided collations
The leak happens in str_tolower, str_toupper and str_initcap, which are
used in several places including their equivalent SQL-level functions,
and can only be triggered when an ICU-provided collation is used to
convert the input string.
b615920 fixed a similar leak. Backpatch down to 10, where ICU collations
were introduced.
Author: Konstantin Knizhnik
Discussion: https://postgr.es/m/94c0ad0a-cbc2-e4a3-7829-2bdeaf9146db@postgrespro.ru
Backpatch-through: 10
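For illustration, a minimal sketch of a query shape that reached the
leaky path (the collation name is just an example; any ICU-provided
collation applies):

    -- each function converts the input string under the ICU collation
    SELECT lower(s COLLATE "en-x-icu"),
           upper(s COLLATE "en-x-icu"),
           initcap(s COLLATE "en-x-icu")
    FROM (VALUES ('Some Text')) AS t(s);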
Tom Lane [Mon, 2 Sep 2019 20:10:37 +0000 (16:10 -0400)]
Avoid touching replica identity index in ExtractReplicaIdentity().
In what seems like a fit of misplaced optimization,
ExtractReplicaIdentity() accessed the relation's replica-identity
index without taking any lock on it. Usually, the surrounding query
already holds some lock so this is safe enough ... but in the case
of a previously-planned delete, there might be no existing lock.
Given a suitable test case, this is exposed in v12 and HEAD by an
assertion added by commit b04aeb0a0.
The whole thing's rather poorly thought out anyway; rather than
looking directly at the index, we should use the index-attributes
bitmap that's held by the parent table's relcache entry, as the
caller functions do. This is more consistent and likely a bit
faster, since it avoids a cache lookup. Hence, change to doing it
that way.
While at it, rather than blithely assuming that the identity
columns are non-null (with catastrophic results if that's wrong),
add assertion checks that they aren't null. Possibly those should
be actual test-and-elog, but I'll leave it like this for now.
In principle, this is a bug that's been there since this code was
introduced (in 9.4). In practice, the risk seems quite low, since
we do have a lock on the index's parent table, so concurrent
changes to the index's catalog entries seem unlikely. Given the
precedent that commit 9c703c169 wasn't back-patched, I won't risk
back-patching this further than v12.
Tom Lane [Mon, 2 Sep 2019 18:02:45 +0000 (14:02 -0400)]
Handle corner cases correctly in psql's reconnection logic.
After an unexpected connection loss and successful reconnection,
psql neglected to resynchronize its internal state about the server,
such as server version. Ordinarily we'd be reconnecting to the same
server and so this isn't really necessary, but there are scenarios
where we do need to update --- one example is where we have a list
of possible connection targets and they're not all alike.
Define "resynchronize" as including connection_warnings(), so that
this case acts the same as \connect. This seems useful; for example,
if the server version did change, the user might wish to know that.
An attuned user might also notice that the new connection isn't
SSL-encrypted, for example, though this approach isn't especially
in-your-face about such changes. Although this part is a behavioral
change, it only affects interactive sessions, so it should not break
any applications.
Also, in do_connect, make sure that we desynchronize correctly when
abandoning an old connection in non-interactive mode.
These problems evidently are the result of people patching only one
of the two places where psql deals with connection changes, so insert
some cross-referencing comments in hopes of forestalling future bugs
of the same ilk.
Lastly, in Windows builds, issue codepage mismatch warnings only at
startup, not during reconnections. psql's codepage can't change
during a reconnect, so complaining about it again seems like useless
noise.
Peter Billen and Tom Lane. Back-patch to all supported branches.
Tom Lane [Fri, 30 Aug 2019 17:02:35 +0000 (13:02 -0400)]
Doc: remove some long-obsolete information from installation.sgml.
Section 16.2 pointed to platform-specific FAQ files that we removed
way back in 8.4. Section 16.7 contained a bunch of information about
AIX and HPUX bugs that were squashed decades ago, plus discussions of
old compiler versions that are certainly moot now that we require C99
support. Since we're obviously not maintaining this stuff carefully,
just remove it. The HPUX sub-section seems like it can go away
entirely, since everything it said that was still applicable was
redundant with material elsewhere in the chapter.
In passing, I couldn't resist the temptation to do a small amount
of copy-editing on nearby text.
Back-patch to v12, since this stuff is surely obsolete in any
branch that requires C99.
Fix overflow check and comment in GIN posting list encoding.
The comment did not match what the code actually did for integers with
the 43rd bit set. You get an integer like that if you have a posting
list with two adjacent TIDs that are more than 2^31 blocks apart.
According to the comment, we would store that in 6 bytes, with no
continuation bit on the 6th byte, but in reality, the code encodes it
using 7 bytes, with a continuation bit on the 6th byte as normal.
The decoding routine also handled these 7-byte integers correctly, except
for an overflow check that assumed that one integer needs at most 6 bytes.
Fix the overflow check, and fix the comment to match what the code
actually does. Also fix the comment that claimed that there are 17 unused
bits in the 64-bit representation of an item pointer. In reality, there
are 64-32-11=21.
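Spelling out the arithmetic (an item pointer is a 32-bit block number
plus an 11-bit offset, and varbyte encoding carries 7 payload bits per
byte):

    32 + 11 = 43, \qquad 64 - 43 = 21\ \text{unused bits}
    \lceil 43 / 7 \rceil = 7\ \text{bytes in the worst case}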
Fitting any item pointer into max 6 bytes was an important property when
this was written, because in the old pre-9.4 format, item pointers were
stored as plain arrays, with 6 bytes for every item pointer. The maximum
of 6 bytes per integer in the new format guaranteed that we could convert
any page from the old format to the new format after upgrade, so that the
new format was never larger than the old format. But we hardly need to
worry about that anymore, and running into that problem during upgrade,
where an item pointer is expanded from 6 to 7 bytes such that the data
doesn't fit on a page anymore, is implausible in practice anyway.
Backpatch to all supported versions.
This also includes a little test module to test these large distances
between item pointers, without requiring a 16 TB table. It is not
backpatched; I'm including it more for the benefit of future development
of new posting list formats.
Discussion: https://www.postgresql.org/message-id/33bfc20a-5c86-f50c-f5a5-58e9925d05ff%40iki.fi
Reviewed-by: Masahiko Sawada, Alexander Korotkov
Thomas Munro [Wed, 28 Aug 2019 01:37:03 +0000 (13:37 +1200)]
Avoid catalog lookups in RelationAllowsEarlyPruning().
RelationAllowsEarlyPruning() performed a catalog scan, but is used
in two contexts where that was a bad idea:
1. In heap_page_prune_opt(), which runs very frequently in some large
scans. This caused major performance problems in a field report
that was easy to reproduce.
2. In TestForOldSnapshot(), which runs while we hold a buffer content
lock. It's not clear if this was guaranteed to be free of buffer
deadlock risk.
The check was introduced in commit 2cc41acd8 and defended against a
real problem: 9.6's hash indexes have no page LSN and so we can't
allow early pruning (ie the snapshot-too-old feature). We can remove
the check from all later releases though: hash indexes are now logged,
and there is no way to create UNLOGGED indexes on regular logged
tables.
If a future release allows such a combination, it might need to put
a similar check in place, but it'll need some more thought.
Back-patch to 10.
Author: Thomas Munro
Reviewed-by: Tom Lane, who spotted the second problem
Discussion: https://postgr.es/m/CA%2BhUKGKT8oTkp5jw_U4p0S-7UG9zsvtw_M47Y285bER6a2gD%2Bg%40mail.gmail.com
Discussion: https://postgr.es/m/CAA4eK1%2BWy%2BN4eE5zPm765h68LrkWc3Biu_8rzzi%2BOYX4j%2BiHRw%40mail.gmail.com
Michael Paquier [Wed, 28 Aug 2019 02:48:15 +0000 (11:48 +0900)]
Disable timeouts when running pg_rewind with online source cluster
In this case, the transfer uses a libpq connection, which is subject to
the timeout parameters set at the system level, and this can cause the
rewind operation to be suddenly canceled, which is not good for
automation. One workaround for such issues would be to use PGOPTIONS to
enforce the wanted timeout parameters, but that's annoying; pg_dump,
for example, which can run potentially long-running queries, disables
all types of timeouts.
lock_timeout and statement_timeout are the ones which can cause problems
now. Note that pg_rewind does not use transactions, so disabling
idle_in_transaction_session_timeout is optional, but it feels safer to
do so for the future.
This is back-patched down to 9.5. idle_in_transaction_session_timeout
is only present since 9.6.
Author: Alexander Kukushkin
Discussion: https://postgr.es/m/CAFh8B=krcVXksxiwVQh1SoY+ziJ-JC=6FcuoBL3yce_40Es5_g@mail.gmail.com
Backpatch-through: 9.5
Tom Lane [Tue, 27 Aug 2019 21:24:13 +0000 (17:24 -0400)]
Improve what pg_strsignal prints if we haven't got strsignal(3).
Turns out that returning "unrecognized signal" is confusing.
Make it explicit that the platform lacks any support for signal names.
(At least among the machines in the buildfarm, only HPUX lacks it.)
Back-patch to v12 where we invented this function.
Tom Lane [Tue, 27 Aug 2019 20:37:22 +0000 (16:37 -0400)]
Doc: clarify behavior of standard aggregates for null inputs.
Section 4.2.7 says that unless otherwise specified, built-in
aggregates ignore rows in which any input is null. This is
not true of the JSON aggregates, but it wasn't documented.
Fix that.
Of the other entries in table 9.55, some were explicit about
ignoring nulls, and some weren't; for consistency and
self-contained-ness, make them all say it explicitly.
Per bug #15884 from Tim Möhlmann. Back-patch to all supported
branches.
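As a quick sketch of the documented rule and the JSON exception
(example values chosen arbitrarily):

    -- count() ignores the null input; json_agg() keeps it
    SELECT count(x) AS c, json_agg(x) AS j
    FROM (VALUES (1), (NULL), (2)) AS v(x);
    -- c => 2, j => [1, null, 2]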
Tom Lane [Tue, 27 Aug 2019 18:44:26 +0000 (14:44 -0400)]
Reject empty names and recursion in config-file include directives.
An empty file name or subdirectory name leads join_path_components() to
just produce the parent directory name, which leads to weird failures or
recursive inclusions. Let's throw a specific error for that. It takes
only slightly more code to detect all-blank names, so do so.
Also, detect direct recursion, ie a file calling itself. As coded
this will also detect recursion via "include_dir '.'", which is
perhaps more likely than explicitly including the file itself.
Detecting indirect recursion would require API changes for guc-file.l
functions, which seems not worth it since extensions might call them.
The nesting depth limit will catch such cases eventually, just not
with such an on-point error message.
In passing, adjust the example usages in postgresql.conf.sample
to perhaps eliminate the problem at the source: there's no reason
for the examples to suggest that an empty value is valid.
Per a trouble report from Brent Bates. Back-patch to 9.5; the
issue is old, but the code in 9.4 is enough different that the
patch doesn't apply easily, and it doesn't seem worth the trouble
to fix there.
Michael Paquier [Tue, 27 Aug 2019 00:11:38 +0000 (09:11 +0900)]
Fix failure of --jobs with vacuumdb on Windows
FD_SETSIZE needs to be defined before including winsock2.h, or it is
possible to run into buffer overflow issues when using --jobs. This is
similar to the solution done for pgbench in a23c641.
This was introduced by 71d84ef; older versions have been using the
default value of FD_SETSIZE, which is 64.
Per buildfarm member jacana, but this impacts all Windows animals
running the TAP tests. I have reproduced the failure locally to check
the patch.
Author: Michael Paquier
Reviewed-by: Andrew Dunstan
Discussion: https://postgr.es/m/20190826054000.GE7005@paquier.xyz
Backpatch-through: 9.5
Andrew Dunstan [Mon, 26 Aug 2019 12:11:27 +0000 (08:11 -0400)]
Adjust to latest Msys2 kernel release number
Previously 'uname -r' on Msys2 reported a kernel release starting with
2. The latest version starts with 3. In commit 1638623f we specifically
looked for one starting with 2. This is now changed to look for any
digit between 2 and 9.
Andrew Dunstan [Mon, 26 Aug 2019 11:44:34 +0000 (07:44 -0400)]
Treat MINGW and MSYS the same in pg_upgrade test script
On msys2, 'uname -s' reports a string starting with MSYS, instead of
MINGW as happens on msys1. Treat both the same way. This reverts
608a710195a4b in favor of a more general solution.
Michael Paquier [Mon, 26 Aug 2019 02:14:24 +0000 (11:14 +0900)]
Fix error handling of vacuumdb when running out of fds
When trying to use a high number of jobs, vacuumdb only checked for a
maximum number of jobs, causing confusing failures when the jobs opened
connections to Postgres and ran out of file descriptors.
This commit changes the error handling so that the option value is no
longer checked against FD_SETSIZE at parse time; instead, we check
whether each file descriptor is within the supported range when opening
the connections for the jobs, so that the problem is detected at the
earliest time possible.
Also, improve the error message to give a hint about the number of jobs
recommended, using a wording given by the reviewers of the patch.
Reported-by: Andres Freund
Author: Michael Paquier
Reviewed-by: Andres Freund, Álvaro Herrera, Tom Lane
Discussion: https://postgr.es/m/20190818001858.ho3ev4z57fqhs7a5@alap3.anarazel.de
Backpatch-through: 9.5
Tom Lane [Sun, 25 Aug 2019 19:04:04 +0000 (15:04 -0400)]
Avoid platform-specific null pointer dereference in psql.
POSIX permits getopt() to advance optind beyond argc when the last
argv entry is an option that requires an argument and hasn't got one.
It seems that no major platforms actually do that, but musl does,
so that something like "psql -f" would crash with that libc.
Add a check that optind is in range before trying to look at the
possibly-bogus option.
Report and fix by Quentin Rameau. Back-patch to all supported
branches.
Tom Lane [Sun, 25 Aug 2019 16:14:50 +0000 (12:14 -0400)]
Back off output precision in circle.sql regression test.
We were setting extra_float_digits = 0 to avoid platform-dependent
output in this test, but that's still able to expose platform-specific
roundoff behavior in some new test cases added by commit a3d284485,
as reported by Peter Eisentraut. Reduce it to -1 to hide that.
(Over in geometry.sql, we're using -3, which is an ancient decision
dating to 337f73b1b. I wonder whether that's overkill now. But
there's probably little value in trying to change it.)
Back-patch to v12 where a3d284485 came in; there's no evidence that
we have any platform-dependent issues here before that.
Thomas Munro [Sun, 25 Aug 2019 01:54:48 +0000 (13:54 +1200)]
Don't rely on llvm::make_unique.
Bleeding-edge LLVM has stopped supplying replacements for various
C++14 library features, for people on older C++ versions. Since we're
not ready to require C++14 yet, just use plain old new instead of
make_unique. As revealed by buildfarm animal seawasp.
Back-patch to 11.
Reviewed-by: Andres Freund
Discussion: https://postgr.es/m/CA%2BhUKGJWG7unNqmkxg7nC5o3o-0p2XP6co4r%3D9epqYMm8UY4Mw%40mail.gmail.com
Michael Paquier [Fri, 23 Aug 2019 11:41:13 +0000 (20:41 +0900)]
Improve documentation of pageinspect
This adds a section for heap-related functions. These were previously
mixed with functions having a more general purpose, leading to
confusion. While at it, add a query example for fsm_page_contents.
Backpatch down to 10, where b5e3942 introduced the subsections for
function types in pageinspect documentation.
Alvaro Herrera [Wed, 21 Aug 2019 15:12:44 +0000 (11:12 -0400)]
Fix typo
In early development patches, "replication origins" were called "identifiers";
almost everything was renamed, but these references to the old terminology
went unnoticed.
Tom Lane [Mon, 19 Aug 2019 22:00:57 +0000 (18:00 -0400)]
Restore json{b}_populate_record{set}'s ability to take type info from AS.
If the record argument is NULL and has no declared type more concrete
than RECORD, we can't extract useful information about the desired
rowtype from it. In this case, see if we're in FROM with an AS clause,
and if so extract the needed rowtype info from AS.
It worked like this before v11, but commit 37a795a60 removed the
behavior, reasoning that it was undocumented, inefficient, and utterly
not self-consistent. If you want to take type info from an AS clause,
you should be using the json_to_record() family of functions not the
json_populate_record() family. Also, it was already the case that
the "populate" functions would fail for a null-valued RECORD input
(with an unfriendly "record type has not been registered" error)
when there wasn't an AS clause at hand, and it wasn't obvious that
that behavior wasn't OK when there was one. However, it emerges
that some people were depending on this to work, and indeed the
rather off-point error message you got if you left off AS encouraged
slapping on AS without switching to the json_to_record() family.
Hence, put back the fallback behavior of looking for AS. While at it,
improve the run-time error you get when there's no place to obtain type
info; we can do a lot better than "record type has not been registered".
(We can't, unfortunately, easily improve the parse-time error message
that leads people down this path in the first place.)
While at it, I refactored the code a bit to avoid duplicating the
same logic in several different places.
Per bug #15940 from Jaroslav Sivy. Back-patch to v11 where the
current coding came in. (The pre-v11 deficiencies in this area
aren't regressions, so we'll leave those branches alone.)
Patch by me, based on preliminary analysis by Dmitry Dolgov.
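A sketch of the restored usage (column names are illustrative):

    -- with a null RECORD argument, the rowtype comes from the AS clause
    SELECT *
    FROM json_populate_record(NULL::record, '{"a": 1, "b": "x"}')
         AS t(a int, b text);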
Tom Lane [Sun, 18 Aug 2019 21:11:57 +0000 (17:11 -0400)]
Disallow changing an inherited column's type if not all parents changed.
If a table inherits from multiple unrelated parents, we must disallow
changing the type of a column inherited from multiple such parents, else
it would be out of step with the other parents. However, it's possible
for the column to ultimately be inherited from just one common ancestor,
in which case a change starting from that ancestor should still be
allowed. (I would not be excited about preserving that option, were
it not that we have regression test cases exercising it already ...)
It's slightly annoying that this patch looks different from the logic
with the same end goal in renameatt(), and more annoying that it
requires an extra syscache lookup to make the test. However, the
recursion logic is quite different in the two functions, and a
back-patched bug fix is no place to be trying to unify them.
Per report from Manuel Rigger. Back-patch to 9.5. The bug exists in
9.4 too (and doubtless much further back); but the way the recursion
is done in 9.4 is a good bit different, so that substantial refactoring
would be needed to fix it in 9.4. I'm disinclined to do that, or risk
introducing new bugs, for a bug that has escaped notice for this long.
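A sketch of the now-disallowed case (table names are hypothetical):

    CREATE TABLE p1 (f int);
    CREATE TABLE p2 (f int);
    CREATE TABLE c () INHERITS (p1, p2);
    -- c.f is inherited from two unrelated parents, so this is rejected:
    ALTER TABLE p1 ALTER COLUMN f TYPE bigint;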
Andres Freund [Fri, 16 Aug 2019 22:24:22 +0000 (15:24 -0700)]
Add default_table_access_method to postgresql.conf.sample.
Reported-By: Heikki Linnakangas
Author: Michael Paquier
Discussion: https://postgr.es/m/d6ffbebb-a0d2-181c-811d-b029b2225ed7@iki.fi
Backpatch: 12-, where pluggable table access methods were introduced
Tom Lane [Fri, 16 Aug 2019 00:04:19 +0000 (20:04 -0400)]
Prevent possible double-free when update trigger returns old tuple.
This is a variant of the problem fixed in commit 25b692568, which
unfortunately we failed to detect at the time. If an update trigger
returns the "old" tuple, as it's entitled to do, then a subsequent
iteration of the loop in ExecBRUpdateTriggers would have "oldtuple"
equal to "trigtuple" and would fail to notice that it shouldn't
free that.
In addition to fixing the code, extend the test case added by 25b692568 so that it covers multiple-trigger-iterations cases.
This problem does not manifest in v12/HEAD, as a result of the
relevant code having been largely rewritten for slotification.
However, include the test case into v12/HEAD anyway, since this
is clearly an area that someone could break again in future.
Per report from Piotr Gabriel Kosinski. Back-patch into all
supported branches, since the bug seems quite old.
Diagnosis and code fix by Thomas Munro, test case by me.
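A sketch of the trigger shape involved (names are illustrative):

    -- Returning OLD is legitimate: the update proceeds with the original
    -- row values. With more than one such trigger firing, a later loop
    -- iteration saw oldtuple equal to trigtuple and freed it wrongly.
    CREATE FUNCTION keep_old() RETURNS trigger AS $$
    BEGIN
        RETURN OLD;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER keep_old_1 BEFORE UPDATE ON tab
        FOR EACH ROW EXECUTE PROCEDURE keep_old();
    CREATE TRIGGER keep_old_2 BEFORE UPDATE ON tab
        FOR EACH ROW EXECUTE PROCEDURE keep_old();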
Tom Lane [Thu, 15 Aug 2019 19:21:47 +0000 (15:21 -0400)]
Fix plpgsql to re-look-up composite type names at need.
Commit 4b93f5799 rearranged things in plpgsql to make it cope better with
composite types changing underneath it intra-session. However, I failed to
consider the case of a composite type being dropped and recreated entirely.
In my defense, the previous coding didn't consider that possibility at all
either --- but it would accidentally work so long as you didn't change the
type's field list, because the built-at-compile-time list of component
variables would then still match the type's new definition. The new
coding, however, occasionally tries to re-look-up the type by OID, and
then fails to find the dropped type.
To fix this, we need to save the TypeName struct, and then redo the type
OID lookup from that. Of course that's expensive, so we don't want to do
it every time we need the type OID. This can be fixed in the same way that 4b93f5799 dealt with changes to composite types' definitions: keep an eye
on the type's typcache entry to see if its tupledesc has been invalidated.
(Perhaps, at some point, this mechanism should be generalized so it can
work for non-composite types too; but for now, plpgsql only tries to
cope with intra-session redefinitions of composites.)
I'm slightly hesitant to back-patch this into v11, because it changes
the contents of struct PLpgSQL_type as well as the signature of
plpgsql_build_datatype(), so in principle it could break code that is
poking into the innards of plpgsql. However, the only popular extension
of that ilk is pldebugger, and it doesn't seem to be affected. Since
this is a regression for people who were relying on the old behavior,
it seems worth taking the small risk of causing compatibility issues.
Per bug #15913 from Daniel Fiori. Back-patch to v11 where 4b93f5799
came in.
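A sketch of the failing sequence (type and function names are
hypothetical):

    CREATE TYPE pair AS (a int, b int);
    CREATE FUNCTION get_a() RETURNS int AS $$
    DECLARE
        v pair;
    BEGIN
        v := ROW(1, 2);
        RETURN v.a;
    END;
    $$ LANGUAGE plpgsql;
    SELECT get_a();   -- caches the composite type
    DROP TYPE pair;
    CREATE TYPE pair AS (a int, b int);
    SELECT get_a();   -- failed before this fix; now re-looks-up by name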
Tom Lane [Wed, 14 Aug 2019 19:09:20 +0000 (15:09 -0400)]
Fix ALTER SYSTEM to cope with duplicate entries in postgresql.auto.conf.
ALTER SYSTEM itself normally won't make duplicate entries (although
up till this patch, it was possible to confuse it by writing case
variants of a GUC's name). However, if some external tool has appended
entries to the file, that could result in duplicate entries for a single
GUC name. In such a situation, ALTER SYSTEM did exactly the wrong thing,
because it replaced or removed only the first matching entry, leaving
the later one(s) still there and hence still determining the active value.
This patch fixes that by making ALTER SYSTEM sweep through the file and
remove all matching entries, then (if not ALTER SYSTEM RESET) append the
new setting to the end. This means entries will be in order of last
setting rather than first setting, but that shouldn't hurt anything.
Also, make the comparisons case-insensitive so that the right things
happen if you do, say, ALTER SYSTEM SET "TimeZone" = 'whatever'.
This has been broken since ALTER SYSTEM was invented, so back-patch
to all supported branches.
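For example (the GUC and values are arbitrary):

    ALTER SYSTEM SET "TimeZone" = 'UTC';       -- case variants now match
    ALTER SYSTEM SET timezone = 'Asia/Tokyo';  -- replaces the entry above
    ALTER SYSTEM RESET timezone;               -- removes every matching entry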
Michael Paquier [Wed, 14 Aug 2019 04:38:16 +0000 (13:38 +0900)]
Fix random regression failure in test case "collate.icu.utf8"
This is a fix similar to 2d7d67cc, where a slight plan alteration can
cause a random failure of this regression test because of an incorrect
tuple ordering, except that this one involves lookups of pg_type.
Similarly to the other case, add ORDER BY clauses to ensure the output
order.
The failure has been seen at least once on buildfarm member skink.
Reported-by: Thomas Munro
Discussion: https://postgr.es/m/CA+hUKGLjR9ZBvhXcr9b-NSBHPw9aRgbjyzGE+kqLsT4vwX+nkQ@mail.gmail.com
Backpatch-through: 12
Tom Lane [Tue, 13 Aug 2019 20:57:58 +0000 (16:57 -0400)]
Un-break pg_dump for pre-8.3 source servers.
Commit 07b39083c inserted an unconditional reference to pg_opfamily,
which of course fails on servers predating that catalog. Fortunately,
the case it's trying to solve can't occur on such old servers (AFAIK).
Hence, just skip the additional code when the source predates 8.3.
Per bug #15955 from sly. Back-patch to all supported branches,
like the previous patch.
Michael Paquier [Tue, 13 Aug 2019 01:55:58 +0000 (10:55 +0900)]
Fix random regression failure in test case "temp"
This test case could fail because of an incorrect result ordering when
looking up pg_class entries. This commit adds an ORDER BY to the
culprit query. The failure was likely caused by a plan switch: by
default, the planner would choose an index-only scan or an index scan,
but even a small change in the startup cost could have caused a bitmap
heap scan to be chosen instead, causing the failure.
While at it, switch some filtering quals to a regular expression, as per
an idea of Tom Lane. As previously shaped, the quals would have
selected any relation whose name begins with "temp", and that could
cause failures if another test running in parallel began to use similar
relation names.
Per report from buildfarm member anole, though the failure was very
rare. This test has been introduced by 319a810, so backpatch down to
v10.
Peter Geoghegan [Mon, 12 Aug 2019 22:21:30 +0000 (15:21 -0700)]
amcheck: Skip unlogged relations during recovery.
contrib/amcheck failed to consider the possibility that unlogged
relations will not have any main relation fork files when running in hot
standby mode. This led to low-level "can't happen" errors that complain
about the absence of a relfilenode file.
To fix, simply skip verification of unlogged index relations during
recovery. In passing, add a direct check for the presence of a main
fork just before verification proper begins, so that we cleanly verify
the presence of the main relation fork file.
Author: Andrey Borodin, Peter Geoghegan
Reported-By: Andrey Borodin
Diagnosed-By: Andrey Borodin
Discussion: https://postgr.es/m/DA9B33AC-53CB-4643-96D4-7A0BBC037FA1@yandex-team.ru
Backpatch: 10-, where amcheck was introduced.
Tom Lane [Mon, 12 Aug 2019 17:15:47 +0000 (13:15 -0400)]
Fix planner's test for case-foldable characters in ILIKE with ICU.
As coded, the ICU-collation path in pattern_char_isalpha() failed
to consider regular ASCII letters to be case-varying. This led to
like_fixed_prefix treating too much of an ILIKE pattern as being a
fixed prefix, so that indexscans derived from an ILIKE clause might
miss entries that they should find.
Per bug #15892 from James Inform. This is an oversight in the original
ICU patch (commit eccfef81e), so back-patch to v10 where that came in.
Take into account that pg_server_to_any() may return the input string "as is".
Reported-by: Andrew Dunstan, Thomas Munro
Discussion: https://postgr.es/m/0ed83a33-d900-466a-880a-70ef456c721f%402ndQuadrant.com
Author: Alexander Korotkov, Thomas Munro
Backpatch-through: 12
We have implemented jsonpath string comparison using the default
database locale. However, the standard requires us to compare Unicode
codepoints. This commit implements that, but for performance reasons we
still use per-byte comparison for the "==" operator. Thus, for
consistency, the other comparison operators do per-byte comparison if
the Unicode codepoints appear to be equal.
In some edge cases, when the same Unicode codepoints have different
binary representations in the database encoding, we diverge from the
standard to achieve better performance of the "==" operator. In the
future, to implement strict standard conformance, we can normalize the
input JSON strings.
Original patch was written by Nikita Glukhov, rewritten by me.
Reported-by: Markus Winand
Discussion: https://postgr.es/m/8B7FA3B4-328D-43D7-95A8-37B8891B8C78%40winand.at
Author: Nikita Glukhov, Alexander Korotkov
Backpatch-through: 12
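An illustrative probe of the new behavior ("A" is U+0041, "a" is
U+0061):

    SELECT jsonb_path_query('["A", "a"]', '$[*] ? (@ < "a")');
    -- returns "A": codepoint order, independent of the database collation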
Tom Lane [Sat, 10 Aug 2019 15:30:11 +0000 (11:30 -0400)]
Fix "ANALYZE t, t" inside a transaction block.
This failed with either "tuple already updated by self" or "duplicate
key value violates unique constraint", depending on whether the table
had previously been analyzed or not. The reason is that ANALYZE tried
to insert or update the same pg_statistic rows twice, and there was no
CommandCounterIncrement between. So add one. The same case works fine
outside a transaction block, because then there's a whole transaction
boundary between, as a consequence of the way VACUUM works.
This issue has been latent all along, but the problem was unreachable
before commit 11d8d72c2 added the ability to specify multiple tables
in ANALYZE. We could, perhaps, alternatively fix it by adding code to
de-duplicate the list of VacuumRelations --- but that would add a
lot of overhead to work around dumb commands, so it's not attractive.
Per bug #15946 from Yaroslav Schekin. Back-patch to v11.
(Note: in v11 I also back-patched the test added by commit 23224563d;
otherwise the problem doesn't manifest in the test I added, because
"vactst" is empty when the tests for multiple ANALYZE targets are
reached. That seems like not a very good thing anyway, so I did this
rather than rethinking the choice of test case.)
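The failing case boils down to this (any analyzable table works):

    -- errored before the fix; the same statement worked outside
    -- a transaction block
    BEGIN;
    ANALYZE vactst, vactst;
    COMMIT;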
Tom Lane [Wed, 7 Aug 2019 22:09:28 +0000 (18:09 -0400)]
Doc: document permissions required for ANALYZE.
VACUUM's reference page had this text, but ANALYZE's didn't. That's
a clear oversight given that section 5.7 explicitly delegates the
responsibility to define permissions requirements to the individual
commands' man pages.
Per gripe from Isaac Morland. Back-patch to all supported branches.
In serializable mode, heap_hot_search_buffer() incorrectly acquired a
predicate lock on the root tuple, not the returned tuple that satisfied
the visibility checks. As explained in README-SSI, the predicate lock does
not need to be copied or extended to other tuple versions, but for that to
work, the correct, visible, tuple version must be locked in the first
place.
The original SSI commit had this bug in it, but it was fixed back in 2013,
in commit 81fbbfe335. But unfortunately, it was reintroduced a few months
later in commit b89e151054. Wising up from that, add a regression test
to cover this, so that it doesn't get reintroduced again. Also, move the
code that sets 't_self', so that it happens at the same time that the
other HeapTuple fields are set, to make it clearer that all the code in
the loop operates on the "current" tuple in the chain, not the root tuple.
Bug spotted by Andres Freund, analysis and original fix by Thomas Munro,
test case and some additional changes to the fix by Heikki Linnakangas.
Backpatch to all supported versions (9.4).
Michael Paquier [Wed, 7 Aug 2019 09:17:34 +0000 (18:17 +0900)]
Fix some incorrect parsing of time with time zone strings
When parsing a timetz string with a dynamic timezone abbreviation or
with no timezone specified, it was possible to generate incorrect
timestamps based on a date built from uninitialized variables, if the
input string did not fully specify a date. This was already checked
when a full timezone spec is included in the input string, but the two
other cases mentioned above missed the same checks.
This is fixed by generating an error, since such input is invalid: in
short, whenever a date is not fully specified.
Valgrind was complaining about this problem.
Bug: #15910
Author: Alexander Lakhin
Discussion: https://postgr.es/m/15910-2eba5106b9aa0c61@postgresql.org
Backpatch-through: 9.4
Tom Lane [Tue, 6 Aug 2019 22:04:51 +0000 (18:04 -0400)]
Fix intarray's GiST opclasses to not fail for empty arrays with <@.
contrib/intarray considers "arraycol <@ constant-array" to be indexable,
but its GiST opclass code fails to reliably find index entries for empty
array values (which of course should trivially match such queries).
This is because the test condition to see whether we should descend
through a non-leaf node is wrong.
Unfortunately, empty array entries could be anywhere in the index,
as these index opclasses are currently designed. So there's no way
to fix this except by lobotomizing <@ indexscans to scan the whole
index ... which is what this patch does. That's pretty unfortunate:
the performance is now actually worse than a seqscan, in most cases.
We'd be better off to remove <@ from the GiST opclasses entirely,
and perhaps a future non-back-patchable patch will do so.
In the meantime, applications whose performance is adversely impacted
have a couple of options. They could switch to a GIN index, which
doesn't have this bug, or they could replace "arraycol <@ constant-array"
with "arraycol <@ constant-array AND arraycol && constant-array".
That will provide about the same performance as before, and it will find
all non-empty subsets of the given constant-array, which is all that
could reliably be expected of the query before.
While at it, add some more regression test cases to improve code
coverage of contrib/intarray.
In passing, adjust resize_intArrayType so that when it's returning an
empty array, it uses construct_empty_array for that rather than
cowboy hacking on the input array. While the hack produces an array
that looks valid for most purposes, it isn't bitwise equal to empty
arrays produced by other code paths, which could have subtle odd
effects. I don't think this code path is performance-critical
enough to justify such shortcuts. (Back-patch this part only as far
as v11; before commit 01783ac36 we were not careful about this in
other intarray code paths either.)
Back-patch the <@ fixes to all supported versions, since this was
broken from day one.
Patch by me; thanks to Alexander Korotkov for review.
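The suggested rewrite, spelled out (the constant array is arbitrary):

    -- before: now scans the whole GiST index
    SELECT * FROM tbl WHERE arraycol <@ '{1,2,3}'::int[];
    -- after: roughly the old performance; finds all non-empty
    -- subsets of the constant, as before
    SELECT * FROM tbl WHERE arraycol <@ '{1,2,3}'::int[]
                        AND arraycol && '{1,2,3}'::int[];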
Tom Lane [Tue, 6 Aug 2019 21:08:07 +0000 (17:08 -0400)]
Save Kerberos and LDAP daemon logs where the buildfarm can find them.
src/test/kerberos and src/test/ldap try to run private authentication
servers, which of course might fail. The logs from these servers
were being dropped into the tmp_check/ subdirectory, but they should
be put in tmp_check/log/, because the buildfarm will only capture
log files in that subdirectory. Without the log output there's
little hope of diagnosing buildfarm failures related to these servers.
Backpatch to v11 where these test suites were added.
Tom Lane [Mon, 5 Aug 2019 15:20:21 +0000 (11:20 -0400)]
Fix choice of comparison operators for cross-type hashed subplans.
Commit bf6c614a2 rearranged the lookup of the comparison operators
needed in a hashed subplan, and in so doing, broke the cross-type
case: it caused the original LHS-vs-RHS operator to be used to compare
hash table entries too (which of course are all of the RHS type).
This leads to C functions being passed a Datum that is not of the
type they expect, with the usual hazards of crashes and unauthorized
server memory disclosure.
For the set of hashable cross-type operators present in v11 core
Postgres, this bug is nearly harmless on 64-bit machines, which
may explain why it escaped earlier detection. But it is a live
security hazard on 32-bit machines; and of course there may be
extensions that add more hashable cross-type operators, which
would increase the risk.
Reported by Andreas Seltenreich. Back-patch to v11 where the
problem came in.
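A sketch of the vulnerable query shape (hypothetical tables; the point
is a cross-type hashable operator between LHS and RHS):

    -- int8 = int4 is hashable; the hash table holds int4 values from the
    -- RHS, so entry-vs-entry comparisons must use int4 equality rather
    -- than the original int8-vs-int4 operator
    SELECT * FROM big WHERE big.id8 IN (SELECT small.id4 FROM small);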
Noah Misch [Mon, 5 Aug 2019 14:48:41 +0000 (07:48 -0700)]
Require the schema qualification in pg_temp.type_name(arg).
Commit aa27977fe21a7dfa4da4376ad66ae37cb8f0d0b5 introduced this
restriction for pg_temp.function_name(arg); do likewise for types
created in temporary schemas. Programs that this breaks should add
"pg_temp." schema qualification or switch to arg::type_name syntax.
Back-patch to 9.4 (all supported versions).
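For a hypothetical temporary type t_name, the affected spellings are:

    SELECT t_name('...');          -- now rejected without qualification
    SELECT pg_temp.t_name('...');  -- OK
    SELECT '...'::t_name;          -- OK: cast syntax is unaffected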
As committed, statement sampling used the existing duration threshold
(log_min_duration_statement) when deciding which statements to sample.
The issue is that even the longest statements are subject to sampling,
and so may not end up logged. An improvement was proposed, introducing
a second duration threshold, but it would not be backwards compatible.
So we've decided to revert this feature - the separate threshold should
be part of the feature itself.
Tom Lane [Sun, 4 Aug 2019 18:05:35 +0000 (14:05 -0400)]
Fix handling of "undef" in contrib/jsonb_plperl.
Perl has multiple internal representations of "undef", and just
testing for SvTYPE(x) == SVt_NULL doesn't recognize all of them,
leading to "cannot transform this Perl type to jsonb" errors.
Use the approved test SvOK() instead.
Report and patch by Ivan Panchenko. Back-patch to v11 where
this module was added.
Tom Lane [Sun, 4 Aug 2019 17:07:12 +0000 (13:07 -0400)]
Avoid picking already-bound TCP ports in kerberos and ldap test suites.
src/test/kerberos and src/test/ldap need to run a private authentication
server of the relevant type, for which they need a free TCP port.
They were just picking a random port number in 48K-64K, which works
except when something's already using the particular port. Notably,
the probability of failure rises dramatically if one simply runs those
tests in a tight loop, because each test cycle leaves behind a bunch of
high ports that are transiently in TIME_WAIT state.
To fix, split out the code that PostgresNode.pm already had for
identifying a free TCP port number, so that it can be invoked to choose
a port for the KDC or LDAP server. This isn't 100% bulletproof, since
conceivably something else on the machine could grab the port between
the time we check and the time we actually start the server. But that's
a pretty short window, so in practice this should be good enough.
Back-patch to v11 where these test suites were added.
Alvaro Herrera [Sun, 4 Aug 2019 15:18:45 +0000 (11:18 -0400)]
Improve pruning of a default partition
When querying a partitioned table containing a default partition, we
were wrongly deciding to include it in the scan too early in the
process, failing to exclude it in some cases. If we reinterpret the
PruneStepResult.scan_default flag slightly, we can do a better job at
detecting that it can be excluded. The change is that we avoid setting
the flag for that pruning step unless the step absolutely requires the
default partition to be scanned (in contrast with the previous
arrangement, which was to set it unless the step was able to prune it).
So get_matching_partitions() must explicitly check the partition that
each returned bound value corresponds to in order to determine whether
the default one needs to be included, rather than relying on the flag
from the final step result.
Author: Yuzuko Hosoya <hosoya.yuzuko@lab.ntt.co.jp>
Reviewed-by: Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>
Discussion: https://postgr.es/m/00e601d4ca86$932b8bc0$b982a340$@lab.ntt.co.jp
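A sketch of the kind of setup involved (names hypothetical; whether the
default partition can be excluded depends on the quals, which is what
this fix gets right):

    CREATE TABLE t (a int) PARTITION BY LIST (a);
    CREATE TABLE t12 PARTITION OF t FOR VALUES IN (1, 2);
    CREATE TABLE t_def PARTITION OF t DEFAULT;
    -- the listed partition fully covers these quals, so t_def should
    -- not appear in the plan:
    EXPLAIN (COSTS OFF) SELECT * FROM t WHERE a = 1 OR a = 2;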
Andres Freund [Fri, 2 Aug 2019 07:02:49 +0000 (00:02 -0700)]
Fix representation of hash keys in Hash/HashJoin nodes.
In 5f32b29c1819 I changed the creation of HashState.hashkeys to
actually use HashState as the parent (instead of HashJoinState, which
was incorrect, as they were executed below HashState), to fix the
problem of hashkeys expressions otherwise relying on slot types
appropriate for HashJoinState, rather than HashState as would be
correct. That reliance was only introduced in 12, which is why it
previously worked to use HashJoinState as the parent (although I'd be
unsurprised if there were problematic cases).
Unfortunately that's not a sufficient solution, because before this
commit, the to-be-hashed expressions referenced inner/outer as
appropriate for the HashJoin, not Hash. That didn't have obvious bad
consequences, because the slots containing the tuples were put into
ecxt_innertuple when hashing a tuple for HashState (even though Hash
doesn't have an inner plan).
There are less common cases where this can cause visible problems
however (rather than just confusion when inspecting such executor
trees). E.g. "ERROR: bogus varno: 65000", when explaining queries
containing a HashJoin where the subsidiary Hash node's hash keys
reference a subplan. While normally hashkeys aren't displayed by
EXPLAIN, if one of those expressions references a subplan, that
subplan may be printed as part of the Hash node - which then failed
because an inner plan was referenced, and Hash doesn't have that.
It seems quite possible that there's other broken cases, too.
Fix the problem by properly splitting the expression for the HashJoin
and Hash nodes at plan time, and have them reference the proper
subsidiary node. While other workarounds are possible, fixing this
correctly seems easy enough. It was a pretty ugly hack to have
ExecInitHashJoin put the expression into the already initialized
HashState, in the first place.
I decided to not just split inner/outer hashkeys inside
make_hashjoin(), but also to separate out hashoperators and
hashcollations at plan time. Otherwise we would have ended up having
two very similar loops, one at plan time and the other during executor
startup. The work seems to more appropriately belong to plan time,
anyway.
Reported-By: Nikita Glukhov, Alexander Korotkov
Author: Andres Freund
Reviewed-By: Tom Lane, in an earlier version
Discussion: https://postgr.es/m/CAPpHfdvGVegF_TKKRiBrSmatJL2dR9uwFCuR+teQ_8tEXU8mxg@mail.gmail.com
Backpatch: 12-
Michael Paquier [Thu, 1 Aug 2019 00:37:48 +0000 (09:37 +0900)]
Fix handling of previous password hooks in passwordcheck
When loading multiple modules that use check_password_hook_type,
loading passwordcheck would remove any trace of a previously-loaded
hook. Unloading the module would also cause previous hooks to be
entirely gone.
Reported-by: Rafael Castro
Author: Michael Paquier
Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/15932-78f48f9ef166778c@postgresql.org
Backpatch-through: 9.4
Tom Lane [Wed, 31 Jul 2019 19:42:50 +0000 (15:42 -0400)]
Fix pg_dump's handling of dependencies for custom opclasses.
Since pg_dump doesn't treat the member operators and functions of operator
classes/families (that is, the pg_amop and pg_amproc entries, not the
underlying operators/functions) as separate dumpable objects, it missed
their dependency information. I think this was safe when the code was
designed, because the default object sorting rule emits operators and
functions before opclasses, and there were no dependency types that could
mess that up. However, the introduction of range types in 9.2 broke it:
now a type can have a dependency on an opclass, allowing dependency rules
to push the opclass before the type and hence before custom operators.
Lacking any information showing that it shouldn't do so, pg_dump emitted
the objects in the wrong order.
Fix by teaching getDependencies() to translate pg_depend entries for
pg_amop/amproc rows to look like dependencies for their parent opfamily.
I added a regression test for this in HEAD/v12, but not further back;
life is too short to fight with 002_pg_dump.pl.
Per bug #15934 from Tom Gottfried. Back-patch to all supported branches.
Andres Freund [Wed, 31 Jul 2019 07:05:21 +0000 (00:05 -0700)]
Remove superfluous newlines in function prototypes.
These were introduced by pgindent due to a fix for broken
indentation (cf. 8255c7a5eeba8). Previously, the mis-indentation of
function prototypes was creatively used to reduce indentation in a few
places.
As that formatting only exists in master and REL_12_STABLE, it seems
better to fix it in both, rather than having some odd indentation in
v12 that somebody might copy for future patches or such.
Author: Andres Freund
Discussion: https://postgr.es/m/20190728013754.jwcbe5nfyt3533vx@alap3.anarazel.de
Backpatch: 12-
The rd_amcache allows an index AM to cache arbitrary information in a
relcache entry. This commit moves the cleanup of rd_amcache so that it
can also be used by table AMs. Nothing takes advantage of that yet, but
I'm sure it'll come in handy for anyone writing new table AMs.
Backpatch to v12, where table AM interface was introduced.
Print WAL position correctly in pg_rewind error message.
This has been wrong ever since pg_rewind was added. The if-branch just
above this, where we print the same error with an extra message supplied
by XLogReadRecord(), got this right, but the variable name was wrong in
the else-branch. As a consequence, the error printed the WAL position as
0/0 if there was an error reading a WAL file.
Tomas Vondra [Tue, 30 Jul 2019 17:17:12 +0000 (19:17 +0200)]
Don't build extended statistics on inheritance trees
When performing ANALYZE on inheritance trees, we collect two samples for
each relation - one for the relation alone, and one for the inheritance
subtree (relation and its child relations). And then we build statistics
on each sample, so for each relation we get two sets of statistics.
For regular (per-column) statistics this works fine, because the catalog
includes a flag differentiating statistics built from those two samples.
But we don't have such flag in the extended statistics catalogs, and we
ended up updating the same row twice, triggering this error:
ERROR: tuple already updated by self
The simplest solution is to disable extended statistics on inheritance
trees, which is what this commit is doing. In the future we may need to
do something similar to per-column statistics, but that requires adding a
flag to the catalog - and that's not backpatchable. Moreover, the current
selectivity estimation code only works with individual relations, so
building statistics on inheritance trees would be pointless anyway.
Author: Tomas Vondra
Backpatch-to: 10-
Discussion: https://postgr.es/m/20190618231233.GA27470@telsasoft.com
Reported-by: Justin Pryzby
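A sketch of the sequence that hit the error (names hypothetical):

    CREATE TABLE parent (a int, b int);
    CREATE TABLE child () INHERITS (parent);
    CREATE STATISTICS st ON a, b FROM parent;
    ANALYZE parent;  -- two samples, one catalog row:
                     -- ERROR: tuple already updated by self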
Tom Lane [Mon, 29 Jul 2019 22:49:04 +0000 (18:49 -0400)]
Fix busted logic for parallel lock grouping in TopoSort().
A "break" statement erroneously left behind by commit a1c1af2a1
caused TopoSort to do the wrong thing if a lock's wait list
contained multiple members of the same locking group.
Because parallel workers don't normally need any locks not already
taken by their leader, this is very hard --- maybe impossible ---
to hit in production. Still, if it did happen, the queries involved
in an otherwise-resolvable deadlock would block until canceled.
In addition to removing the bogus "break", add an Assert showing
that the conflicting uses of the beforeConstraints[] array (for both
counts and flags) don't overlap, and add some commentary explaining
why not, since it's not obvious without explanation, IMHO.
Original report and patch from Rui Hai Jiang; additional assert
and commentary by me. Back-patch to 9.6 where the bug came in.
Michael Paquier [Mon, 29 Jul 2019 00:58:49 +0000 (09:58 +0900)]
Fix handling of expressions and predicates in REINDEX CONCURRENTLY
When copying the definition of an index rebuilt concurrently for the new
entry, the index information was taken directly from the old index using
the relation cache. In this case, predicates and expressions have
undergone some post-processing to prepare things for the planner, which
loses some information, including the collations added in any of them.
This inconsistency can cause issues when attempting for example a table
rewrite, and makes the new indexes rebuilt concurrently inconsistent
with the old entries.
In order to fix the problem, fetch expressions and predicates directly
from the catalog of the old entry, and fill in IndexInfo for the new
index with that. This makes the process more consistent with
DefineIndex(), and the code is refactored with the addition of a routine
to create an IndexInfo node.
Reported-by: Manuel Rigger
Author: Michael Paquier
Discussion: https://postgr.es/m/CA+u7OA5Hp0ra235F3czPom_FyAd-3+XwSJmX95r1+sRPOJc9VQ@mail.gmail.com
Backpatch-through: 12
Thomas Munro [Sun, 28 Jul 2019 22:12:37 +0000 (10:12 +1200)]
Avoid macro clash with LLVM 9.
Early previews of LLVM 9 reveal that our Min() macro causes compiler
errors in LLVM headers reached by the #include directives in
llvmjit_inline.cpp. Let's just undefine it. Per buildfarm animal
seawasp. Back-patch to 11.
Reviewed-by: Fabien Coelho, Tom Lane
Discussion: https://postgr.es/m/20190606173216.GA6306%40alvherre.pgsql
Tom Lane [Fri, 26 Jul 2019 17:07:08 +0000 (13:07 -0400)]
Tweak our special-case logic for the IANA "Factory" timezone.
pg_timezone_names() tries to avoid showing the "Factory" zone in
the view, mainly because that has traditionally had a very long
"abbreviation" such as "Local time zone must be set--see zic manual page",
so that showing it messes up psql's formatting of the whole view.
Since tzdb version 2016g, IANA instead uses the abbreviation "-00",
which is sane enough that there's no reason to discriminate against it.
On the other hand, it emerges that FreeBSD and possibly other packagers
are so wedded to backwards compatibility that they hack the IANA data
to keep the old spelling --- and not just that old spelling, but even
older spellings that IANA used back in the stone age. This caused the
filter logic to fail to suppress "Factory" at all on such platforms,
though the formatting problem is definitely real in that case.
To solve both problems, get rid of the hard-wired assumption about
exactly what Factory's abbreviation is, and instead reject abbreviations
exceeding 31 characters. This will allow Factory to appear in the view
if and only if it's using the modern abbreviation.
In passing, simplify the code we add to zic.c to support "zic -P"
to remove its now-obsolete hacks to not print the Factory zone's
abbreviation. Unlike pg_timezone_names(), there's no reason for
that code to support old/nonstandard timezone data.
Since we generally prefer to keep timezone-related behavior the
same in all branches, and since this is arguably a bug fix,
back-patch to all supported branches.
Tom Lane [Fri, 26 Jul 2019 16:45:32 +0000 (12:45 -0400)]
Avoid choosing "localtime" or "posixrules" as TimeZone during initdb.
Some platforms create a file named "localtime" in the system
timezone directory, making it a copy or link to the active time
zone file. If Postgres is built with --with-system-tzdata, initdb
will see that file as an exact match to localtime(3)'s behavior,
and it may decide that "localtime" is the most preferred spelling of
the active zone. That's a very bad choice though, because it's
neither informative, nor portable, nor stable if someone changes
the system timezone setting. Extend the preference logic added by
commit e3846a00c so that we will prefer any other zone file that
matches localtime's behavior over "localtime".
On the same logic, also discriminate against "posixrules", which
is another not-really-a-zone file that is often present in the
timezone directory. (Since we install "posixrules" but not
"localtime", this change can affect the behavior of Postgres
with or without --with-system-tzdata.)
Note that this change doesn't prevent anyone from choosing these
pseudo-zones if they really want to (i.e., by setting TZ for initdb,
or modifying the timezone GUC later on). It just prevents initdb
from preferring these zone names when there are multiple matches to
localtime's behavior.
Since we generally prefer to keep timezone-related behavior the
same in all branches, and since this is arguably a bug fix,
back-patch to all supported branches.
Tom Lane [Fri, 26 Jul 2019 15:59:00 +0000 (11:59 -0400)]
Fix loss of fractional digits for large values in cash_numeric().
Money values exceeding about 18 digits (depending on lc_monetary)
could be inaccurately converted to numeric, due to select_div_scale()
deciding it didn't need to compute any fractional digits. Force
its hand by setting the dscale of one division input to equal the
number of fractional digits we need.
In passing, rearrange the logic to not do useless work in locales
where money values are considered integral.
Per bug #15925 from Slawomir Chodnicki. Back-patch to all supported
branches.
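An illustrative value (the exact digit count where trouble starts
depends on lc_monetary):

    SELECT '12345678901234567.89'::money::numeric;
    -- before the fix, the fractional .89 could come back as .00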
Thomas Munro [Thu, 25 Jul 2019 22:01:18 +0000 (10:01 +1200)]
Fix LDAP test instability.
After starting slapd, wait until it can accept a connection before
beginning the real test work. This avoids occasional test failures.
Back-patch to 11, where the LDAP tests arrived.
Author: Thomas Munro
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/20190719033013.GI1859%40paquier.xyz
Andres Freund [Thu, 25 Jul 2019 21:22:52 +0000 (14:22 -0700)]
Fix slot type handling for Agg nodes performing internal sorts.
Since 15d8f8312 we assert that - and since 7ef04e4d2cb2, 4da597edf1
rely on - the slot type for an expression's
ecxt_{outer,inner,scan}tuple not changing, unless explicitly flagged
as such. That allows us to either skip deforming (for a virtual tuple
slot) or optimize the code for JIT accelerated deforming
appropriately (for other known slot types).
This assumption was sometimes violated for grouping sets, when
nodeAgg.c internally uses tuplesorts, and the child node doesn't
return a TTSOpsMinimalTuple type slot. Detect that case, and flag that
the outer slot might not be "fixed".
It's probably worthwhile to optimize this further in the future, and
more granularly determine whether the slot is fixed. As we already
instantiate per-phase transition and equal expressions, we could
cheaply set the slot type appropriately for each phase. But that's a
separate change from this bugfix.
This commit does include a very minor optimization: avoid creating a
slot for handling tuplesorts if no such sorts are performed.
Previously we created that slot unnecessarily in the common
case of computing all grouping sets via hashing. The code looked too
confusing without that, as the conditions for needing a sort slot and
flagging that the slot type isn't fixed, are the same.
Reported-By: Ashutosh Sharma
Author: Andres Freund
Discussion: https://postgr.es/m/CAE9k0PmNaMD2oHTEAhRyxnxpaDaYkuBYkLa1dpOpn=RS0iS2AQ@mail.gmail.com
Backpatch: 12-, where the bug was introduced in 15d8f8312
Tom Lane [Thu, 25 Jul 2019 16:10:54 +0000 (12:10 -0400)]
Fix failures to ignore \r when reading Windows-style newlines.
libpq failed to ignore Windows-style newlines in connection service files.
This normally wasn't a problem on Windows itself, because fgets() would
convert \r\n to just \n. But if libpq were running inside a program that
changes the default fopen mode to binary, it would see the \r's and think
they were data. In any case, it's project policy to ignore \r in text
files unconditionally, because people sometimes try to use files with
DOS-style newlines on Unix machines, where the C library won't hide that
from us.
Hence, adjust parseServiceFile() to ignore \r as well as \n at the end of
the line. In HEAD, go a little further and make it ignore all trailing
whitespace, to match what it's always done with leading whitespace.
In HEAD, also run around and fix up everyplace where we have
newline-chomping code to make all those places look consistent and
uniformly drop \r. It is not clear whether any of those changes are
fixing live bugs. Most of the non-cosmetic changes are in places that
are reading popen output, and the jury is still out as to whether popen
on Windows can return \r\n. (The Windows-specific code in pipe_read_line
seems to think so, but our lack of support for this elsewhere suggests
maybe it's not a problem in practice.) Hence, I desisted from applying
those changes to back branches, except in run_ssl_passphrase_command()
which is new enough and little-tested enough that we'd probably not have
heard about any problems there.
Tom Lane and Michael Paquier, per bug #15827 from Jorge Gustavo Rocha.
Back-patch the parseServiceFile() change to all supported branches,
and the run_ssl_passphrase_command() change to v11 where that was added.
Andrew Dunstan [Thu, 25 Jul 2019 15:24:23 +0000 (11:24 -0400)]
Honor MSVC WindowsSDKVersion if set
Add a line to the project file setting the target SDK. Otherwise, in
VS2017 for example, if the default but optional 8.1 SDK is not
installed, the build will fail.
Tom Lane [Thu, 25 Jul 2019 15:02:43 +0000 (11:02 -0400)]
Fix contrib/sepgsql test policy to work with latest SELinux releases.
As of Fedora 30, it seems that the system-provided macros for setting
up user privileges in SELinux policies don't grant the ability to read
/etc/passwd, as they formerly did. This restriction breaks psql
(which tries to use getpwuid() to obtain the user name it's running
under) and thereby the contrib/sepgsql regression test. Add explicit
specifications that we need the right to read /etc/passwd.
Mike Palmiotto, per a report from me. Back-patch to all supported
branches.
Andres Freund [Thu, 25 Jul 2019 01:45:58 +0000 (18:45 -0700)]
Fix system column accesses in ON CONFLICT ... RETURNING.
After 277cb789836 ON CONFLICT ... SET ... RETURNING failed with
ERROR: virtual tuple table slot does not have system attributes
when taking the update path, as the slot used to insert into the
table (and then process RETURNING) was defined to be a virtual slot in
that commit. Virtual slots don't support system columns except for
tableoid and ctid, as the other system columns are AM dependent.
Fix that by using a slot of the table's type. Add tests for system
column accesses in ON CONFLICT ... RETURNING.
Reported-By: Roby, bisected to the relevant commit by Jeff Janes
Author: Andres Freund
Discussion: https://postgr.es/m/73436355-6432-49B1-92ED-1FE4F7E7E100@finefun.com.au
Backpatch: 12-, where the bug was introduced in 277cb789836
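Exercising the fixed path (the table is illustrative):

    CREATE TABLE t (k int PRIMARY KEY, v text);
    INSERT INTO t VALUES (1, 'a');
    -- the update path plus system columns in RETURNING hit the error:
    INSERT INTO t VALUES (1, 'b')
        ON CONFLICT (k) DO UPDATE SET v = excluded.v
        RETURNING ctid, xmin, k, v;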
Tom Lane [Wed, 24 Jul 2019 22:14:26 +0000 (18:14 -0400)]
Fix infelicities in describeOneTableDetails' partitioned-table handling.
describeOneTableDetails issued a partition-constraint-fetching query
for every table, even ones it knows perfectly well are not partitions.
To add insult to injury, it then proceeded to leak the empty PGresult
if the table wasn't a partition. Doing that a lot of times might
amount to a meaningful leak, so this seems like a back-patchable bug.
Fix that, and also fix a related PGresult leak in the partition-parent
case (though that leak would occur only if we got no row, which is
unexpected).
Minor code beautification too, to make this code look more like the
pre-existing code around it.
Back-patch the whole change into v12. However, the fact that we already
know whether the table is a partition dates only to commit 1af25ca0c;
back-patching the relevant changes from that is probably more churn
than is justified in released branches. Hence, in v11 and v10, just
do the minimum to fix the PGresult leaks.
Noted while messing around with adjacent code for yesterday's \d
improvements.
Use full 64-bit XID for checking if a deleted GiST page is old enough.
Otherwise, after a deleted page gets even older, it becomes unrecyclable
again. B-tree has the same problem, and has had since time immemorial,
but let's at least fix this in GiST, where this is new.
Backpatch to v12, where GiST page deletion was introduced.