Tom Lane [Mon, 26 Aug 2019 19:59:44 +0000 (15:59 -0400)]
Fix postmaster state machine to handle dead_end child crashes better.
A report from Alvaro Herrera shows that if we're in PM_STARTUP
state, and we spawn a dead_end child to reject some incoming
connection request, and that child dies with an unexpected exit
code, the postmaster does not respond well. We correctly send
SIGQUIT to the startup process, but then:
* if the startup process exits with nonzero exit code, as expected,
we thought that that indicated a crash and aborted startup.
* if the startup process exits with zero exit code, which is possible
due to the inherent race condition, we'd advance to PM_RUN state
which is fine --- but the code forgot that AbortStartTime would be
nonzero in this situation. We'd either die on the Asserts saying
that it was zero, or perhaps misbehave later on. (A quick look
suggests that the only misbehavior might be busy-waiting due to
DetermineSleepTime doing the wrong thing.)
To fix the first point, adjust the state-machine logic to recognize
that a nonzero exit code is expected after sending SIGQUIT, and have
it transition to a state where we can restart the startup process.
To fix the second point, change the Asserts to clear the variable
rather than just claiming it should be clear already.
Perhaps we could improve this further by not treating a crash of
a dead_end child as a reason for panic'ing the database. However,
since those child processes are connected to shared memory, that
seems a bit risky. There are few good reasons for a dead_end child
to report failure anyway (the cause of this in Alvaro's report is
quite unclear). On balance, therefore, a minimal fix seems best.
This is an oversight in commit 45811be94. While that was back-patched,
I'm hesitant to back-patch this change. The lack of reasons for a
dead_end child to fail suggests that the case should be very rare in
the field, which squares with the lack of reports; so it seems like
this might not be worth the risk of introducing new issues. In any
case we can let it bake awhile in HEAD before considering a back-patch.
Andrew Dunstan [Mon, 26 Aug 2019 12:11:27 +0000 (08:11 -0400)]
Adjust to latest Msys2 kernel release number
Previously 'uname -r' on Msys2 reported a kernel release starting with
2. The latest version starts with 3. In commit 1638623f we specifically
looked for one starting with 2. This is now changed to look for any
digit between 2 and 9.
Andrew Dunstan [Mon, 26 Aug 2019 11:44:34 +0000 (07:44 -0400)]
Treat MINGW and MSYS the same in pg_upgrade test script
On msys2, 'uname -s' reports a string starting with MSYS, instead of MINGW
as happens on msys1. Treat these both the same way. This reverts 608a710195a4b in favor of a more general solution.
Michael Paquier [Mon, 26 Aug 2019 02:14:18 +0000 (11:14 +0900)]
Fix error handling of vacuumdb and reindexdb when running out of fds
When trying to use a high number of jobs, vacuumdb (and more recently
reindexdb) only checked for a maximum number of allowed jobs, causing
confusing failures when the jobs' connections to Postgres ran out of
file descriptors. This commit changes the error handling so that we no
longer check the option value against FD_SETSIZE when parsing it, but
instead check whether a file descriptor is within the supported range
when opening the connections for the jobs, so that the problem is
detected as early as possible.
Also, improve the error message to give a hint about the recommended
number of jobs, using wording given by the reviewers of the patch.
Reported-by: Andres Freund
Author: Michael Paquier
Reviewed-by: Andres Freund, Álvaro Herrera, Tom Lane
Discussion: https://postgr.es/m/20190818001858.ho3ev4z57fqhs7a5@alap3.anarazel.de
Backpatch-through: 9.5
Tom Lane [Sun, 25 Aug 2019 19:04:04 +0000 (15:04 -0400)]
Avoid platform-specific null pointer dereference in psql.
POSIX permits getopt() to advance optind beyond argc when the last
argv entry is an option that requires an argument and hasn't got one.
It seems that no major platforms actually do that, but musl does,
so that something like "psql -f" would crash with that libc.
Add a check that optind is in range before trying to look at the
possibly-bogus option.
Report and fix by Quentin Rameau. Back-patch to all supported
branches.
Tom Lane [Sun, 25 Aug 2019 16:14:50 +0000 (12:14 -0400)]
Back off output precision in circle.sql regression test.
We were setting extra_float_digits = 0 to avoid platform-dependent
output in this test, but that's still able to expose platform-specific
roundoff behavior in some new test cases added by commit a3d284485,
as reported by Peter Eisentraut. Reduce it to -1 to hide that.
(Over in geometry.sql, we're using -3, which is an ancient decision
dating to 337f73b1b. I wonder whether that's overkill now. But
there's probably little value in trying to change it.)
Back-patch to v12 where a3d284485 came in; there's no evidence that
we have any platform-dependent issues here before that.
Thomas Munro [Sun, 25 Aug 2019 01:54:48 +0000 (13:54 +1200)]
Don't rely on llvm::make_unique.
Bleeding-edge LLVM has stopped supplying replacements for various
C++14 library features, for people on older C++ versions. Since we're
not ready to require C++14 yet, just use plain old new instead of
make_unique. As revealed by buildfarm animal seawasp.
Back-patch to 11.
Reviewed-by: Andres Freund
Discussion: https://postgr.es/m/CA%2BhUKGJWG7unNqmkxg7nC5o3o-0p2XP6co4r%3D9epqYMm8UY4Mw%40mail.gmail.com
Peter Geoghegan [Sat, 24 Aug 2019 03:24:49 +0000 (20:24 -0700)]
Explain subtlety in nbtree locking protocol.
The Postgres approach to coupling locks during an ascent of the tree is
slightly different to the approach taken by Lehman and Yao. Add a new
paragraph to the "Differences to the Lehman & Yao algorithm" section of
the nbtree README that explains the similarities and differences.
Michael Paquier [Sat, 24 Aug 2019 02:45:05 +0000 (11:45 +0900)]
Detect unused steps in isolation specs and do some cleanup
This is useful for developers to find out if an isolation spec is
over-engineered or if it needs more work by warning at the end of a
test run if a step is not used, generating a failure with extra diffs.
While on it, clean up all the specs which include steps not used in any
permutations to simplify them.
Author: Michael Paquier
Reviewed-by: Asim Praveen, Melanie Plageman
Discussion: https://postgr.es/m/20190819080820.GG18166@paquier.xyz
Michael Paquier [Sat, 24 Aug 2019 02:35:43 +0000 (11:35 +0900)]
Remove dry-run mode from isolationtester
The original purpose of the dry-run mode was to print all the possible
permutations from a spec file, but it has become less useful since
isolation testing improved its deadlock detection: a step not wanted by
the author can now block indefinitely, whereas originally the blocked
step would have been detected rather quickly. Per discussion, let's
remove it.
Author: Michael Paquier
Reviewed-by: Asim Praveen, Melanie Plageman
Discussion: https://postgr.es/m/20190819080820.GG18166@paquier.xyz
Michael Paquier [Fri, 23 Aug 2019 11:41:06 +0000 (20:41 +0900)]
Improve documentation of pageinspect
This adds a section for heap-related functions. These were previously
mixed with functions having a more general purpose, leading to
confusion. While on it, add a query example for fsm_page_contents.
Backpatch down to 10, where b5e3942 introduced the subsections for
function types in pageinspect documentation.
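For illustration, the sort of query the new example covers might look like
this (a sketch; the exact example in the docs may differ):

    -- assumes the pageinspect extension is installed
    CREATE EXTENSION IF NOT EXISTS pageinspect;
    -- dump the contents of the first free-space-map page of pg_class
    SELECT fsm_page_contents(get_raw_page('pg_class', 'fsm', 0));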
Peter Eisentraut [Wed, 21 Aug 2019 19:33:05 +0000 (21:33 +0200)]
Remove configure detection of crypt()
crypt() hasn't been needed since crypt authentication was removed from
PostgreSQL, so these configure checks are no longer necessary.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/21f88934-f00c-27f6-a9d8-7ea06d317781%402ndquadrant.com
Alvaro Herrera [Wed, 21 Aug 2019 15:12:44 +0000 (11:12 -0400)]
Fix typo
In early development patches, "replication origins" were called "identifiers";
almost everything was renamed, but these references to the old terminology
went unnoticed.
Tom Lane [Wed, 21 Aug 2019 14:43:23 +0000 (10:43 -0400)]
Remove unnecessary test dependency on the contents of pg_pltemplate.
Using pg_pltemplate as test data was probably not very forward-looking,
considering we've had many discussions around removing that catalog
altogether. Use a nearby temp table instead, to make these two test
scripts more self-contained. This is a better test case anyway, since
it exercises the scenario where the entries in the anyarray column
actually vary in type intra-query.
Peter Eisentraut [Tue, 20 Aug 2019 20:25:58 +0000 (22:25 +0200)]
Clean up some SCRAM attribute processing
Correct the comment for read_any_attr(). Give a clearer error message
when parsing at the end of the string, when the client-final-message
does not contain a "p" attribute (for some reason).
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/2fb8a15b-de35-682d-a77b-edcc9c52fa12%402ndquadrant.com
Michael Paquier [Tue, 20 Aug 2019 07:10:20 +0000 (16:10 +0900)]
Fix compilation failure of vacuumdb and reindexdb with OpenBSD
FD_SETSIZE comes from sys/select.h per POSIX, and this header inclusion
has been moved to scripts_parallel.c as of 5f38403 without moving the
code relying on FD_SETSIZE, causing a compilation failure on recent
versions of OpenBSD (6.6 was the version used in the report).
In order to take care of the failure, move FD_SETSIZE directly to
scripts_parallel.c with a wrapper controlling the maximum number of
parallel slots supported, based on a suggestion by Andres Freund.
While on it, reduce the maximum number to be less than FD_SETSIZE,
leaving some room for stdin, stdout and such as they consume some file
descriptors.
The buildfarm did not complain about that, as it happens to only be
an issue on recent versions of OpenBSD and there is no coverage in this
area. 51c3e9f fixed a similar set of issues.
Bug: #15964
Reported-by: Sean Farrell
Discussion: https://postgr.es/m/15964-c1753bdfed722e04@postgresql.org
Tom Lane [Mon, 19 Aug 2019 22:00:57 +0000 (18:00 -0400)]
Restore json{b}_populate_record{set}'s ability to take type info from AS.
If the record argument is NULL and has no declared type more concrete
than RECORD, we can't extract useful information about the desired
rowtype from it. In this case, see if we're in FROM with an AS clause,
and if so extract the needed rowtype info from AS.
It worked like this before v11, but commit 37a795a60 removed the
behavior, reasoning that it was undocumented, inefficient, and utterly
not self-consistent. If you want to take type info from an AS clause,
you should be using the json_to_record() family of functions not the
json_populate_record() family. Also, it was already the case that
the "populate" functions would fail for a null-valued RECORD input
(with an unfriendly "record type has not been registered" error)
when there wasn't an AS clause at hand, and it wasn't obvious that
that behavior wasn't OK when there was one. However, it emerges
that some people were depending on this to work, and indeed the
rather off-point error message you got if you left off AS encouraged
slapping on AS without switching to the json_to_record() family.
Hence, put back the fallback behavior of looking for AS. While at it,
improve the run-time error you get when there's no place to obtain type
info; we can do a lot better than "record type has not been registered".
(We can't, unfortunately, easily improve the parse-time error message
that leads people down this path in the first place.)
While at it, I refactored the code a bit to avoid duplicating the
same logic in several different places.
Per bug #15940 from Jaroslav Sivy. Back-patch to v11 where the
current coding came in. (The pre-v11 deficiencies in this area
aren't regressions, so we'll leave those branches alone.)
Patch by me, based on preliminary analysis by Dmitry Dolgov.
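For illustration, a minimal sketch of the restored behavior (the JSON input
is hypothetical):

    -- the rowtype comes from the AS clause when the record argument is a
    -- bare NULL of type RECORD
    SELECT *
      FROM json_populate_record(NULL::record, '{"x": 776}') AS (x int);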
Tom Lane [Mon, 19 Aug 2019 18:22:56 +0000 (14:22 -0400)]
Add "headerscheck" script to test header-file compilability under C.
We already had "cpluspluscheck", which served the dual purposes of
verifying that headers compile standalone and that they compile as C++.
However, C++ compilers don't have the exact same set of error conditions
as C compilers, so this doesn't really prove that a header will compile
standalone as C.
Hence, add a second script that's largely similar but runs the C
compiler not C++.
Also add a bit more documentation than the none-at-all we had before.
Tom Lane [Mon, 19 Aug 2019 17:17:02 +0000 (13:17 -0400)]
Use zic's new "-b slim" option to generate smaller timezone files.
IANA tzcode release 2019b adds an option that tells zic not to emit
the old 32-bit section of the timezone files, and to skip some other
space-wasting hacks needed for compatibility with old timezone client
libraries. Since we only expect our own code to use the timezone data
we install, and our code is up-to-date with 2019b, there's no apparent
reason not to generate the smallest possible files.
Unfortunately, while the individual zone files do get significantly
smaller in many cases, they were not that big to begin with; which
means that no real space savings ensues on filesystems that don't
optimize small files. (For instance, on ext4 with 4K block size,
"du" says the installed timezone tree is the same size as before.)
Still, it seems worth making the change, if only because this is
presumably the wave of the future. At the very least, we'll save
some cycles while reading a zone file.
But given the marginal value and the fact that this is a new code
path, it doesn't seem worth the risk of back-patching this change
into stable branches. Hence, unlike most of our timezone-related
changes, apply to HEAD only.
Tom Lane [Sun, 18 Aug 2019 23:27:23 +0000 (19:27 -0400)]
Avoid conflicts with library versions of inet_net_ntop() and friends.
Prefix inet_net_ntop and sibling routines with "pg_" to ensure that
they aren't mistaken for C-library functions. This fixes warnings
from cpluspluscheck on some platforms, and should help reduce reader
confusion everywhere, since our functions aren't exactly interchangeable
with the library versions (they may have different ideas about address
family codes).
This shouldn't be fixing any actual bugs, unless somebody's linker
is misbehaving, so no need to back-patch.
Tom Lane [Sun, 18 Aug 2019 23:01:40 +0000 (19:01 -0400)]
Fix incidental warnings from cpluspluscheck.
Remove use of "register" keyword in hashfn.c. It's obsolescent
according to recent C++ compilers, and no modern C compiler pays
much attention to it either.
Also fix one cosmetic warning about signed vs unsigned comparison.
Tom Lane [Sun, 18 Aug 2019 21:51:35 +0000 (17:51 -0400)]
Fix failure-to-compile-standalone in ecpg's dt.h.
This has to have <time.h>, or the references to "struct tm" don't
mean what they should.
We have some other recently-introduced issues of the same ilk,
but this one seems old. No backpatch though, as it's only a
latent problem for most purposes.
Tom Lane [Sun, 18 Aug 2019 21:11:57 +0000 (17:11 -0400)]
Disallow changing an inherited column's type if not all parents changed.
If a table inherits from multiple unrelated parents, we must disallow
changing the type of a column inherited from multiple such parents, else
it would be out of step with the other parents. However, it's possible
for the column to ultimately be inherited from just one common ancestor,
in which case a change starting from that ancestor should still be
allowed. (I would not be excited about preserving that option, were
it not that we have regression test cases exercising it already ...)
It's slightly annoying that this patch looks different from the logic
with the same end goal in renameatt(), and more annoying that it
requires an extra syscache lookup to make the test. However, the
recursion logic is quite different in the two functions, and a
back-patched bug fix is no place to be trying to unify them.
Per report from Manuel Rigger. Back-patch to 9.5. The bug exists in
9.4 too (and doubtless much further back); but the way the recursion
is done in 9.4 is a good bit different, so that substantial refactoring
would be needed to fix it in 9.4. I'm disinclined to do that, or risk
introducing new bugs, for a bug that has escaped notice for this long.
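A minimal sketch of the distinction, using hypothetical table names:

    CREATE TABLE ancestor (f1 int);
    CREATE TABLE p1 () INHERITS (ancestor);
    CREATE TABLE p2 () INHERITS (ancestor);
    CREATE TABLE child () INHERITS (p1, p2);
    -- rejected: child also inherits f1 from p2, which is not being changed
    ALTER TABLE p1 ALTER COLUMN f1 TYPE bigint;
    -- allowed: the change starts at the common ancestor and recurses to
    -- all descendants
    ALTER TABLE ancestor ALTER COLUMN f1 TYPE bigint;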
Tom Lane [Sat, 17 Aug 2019 22:15:38 +0000 (18:15 -0400)]
Make deadlock-parallel isolation test more robust.
This test failed fairly reproducibly on some CLOBBER_CACHE_ALWAYS
buildfarm animals. The cause seems to be that if a parallel worker
is slow enough to reach its lock wait, it may not be released by
the first deadlock check run, and then later deadlock checks might
decide to unblock the d2 session instead of the d1 session, leaving
us in an undetected deadlock state (since the isolationtester client
is waiting for d1 to complete first).
Fix by introducing an additional lock wait at the end of the d2a1
step, ensuring that the deadlock checker will recognize that d1
has to be unblocked before d2a1 completes.
Also reduce max_parallel_workers_per_gather to 3 in this test. With the
default max_worker_processes value, we were only getting one parallel
worker for the d2a1 step, which is not the case I hoped to test. We
should get 3 for d1a2 and 2 for d2a1, as the code stands; and maybe 3
for d2a1 if somebody figures out why the last parallel worker slot isn't
free already.
Peter Eisentraut [Sat, 17 Aug 2019 10:36:30 +0000 (12:36 +0200)]
Improve Assert output
If an assertion expression contained a macro, the failed assertion
message would print the expanded macro, which is usually unhelpful and
confusing. Restructure the Assert macros to not expand any macros
when constructing the failure message.
This also fixes that the existing output for Assert et al. shows
the *inverted* condition, which is also confusing and not how
assertions usually work.
Andres Freund [Fri, 16 Aug 2019 22:24:22 +0000 (15:24 -0700)]
Add default_table_access_method to postgresql.conf.sample.
Reported-By: Heikki Linnakangas
Author: Michael Paquier
Discussion: https://postgr.es/m/d6ffbebb-a0d2-181c-811d-b029b2225ed7@iki.fi
Backpatch: 12-, where pluggable table access methods were introduced
Andres Freund [Fri, 16 Aug 2019 17:35:31 +0000 (10:35 -0700)]
Remove fmgr.h includes from headers that don't really need it.
Most of the fmgr.h includes were obsoleted by 352a24a1f9d6f7d4abb1. A
few others can be obsoleted using the underlying struct type in an
implementation detail.
Author: Andres Freund
Discussion: https://postgr.es/m/20190803193733.g3l3x3o42uv4qj7l@alap3.anarazel.de
Andres Freund [Fri, 16 Aug 2019 17:33:30 +0000 (10:33 -0700)]
Don't include utils/array.h from acl.h.
For most uses of acl.h the details of what "Acl" looks like internally
are irrelevant. It might make sense to move a lot of the
implementation details into a separate header at a later point.
The main motivation of this change is to avoid including fmgr.h (via
array.h, which needs it for exposed structs) in a lot of files that
otherwise don't need it. A subsequent commit will remove the fmgr.h
include from a lot of files.
Directly include utils/array.h and utils/expandeddatum.h in the files
that need them and that previously included them only indirectly, via
acl.h.
Author: Andres Freund
Discussion: https://postgr.es/m/20190803193733.g3l3x3o42uv4qj7l@alap3.anarazel.de
Tom Lane [Fri, 16 Aug 2019 00:04:19 +0000 (20:04 -0400)]
Prevent possible double-free when update trigger returns old tuple.
This is a variant of the problem fixed in commit 25b692568, which
unfortunately we failed to detect at the time. If an update trigger
returns the "old" tuple, as it's entitled to do, then a subsequent
iteration of the loop in ExecBRUpdateTriggers would have "oldtuple"
equal to "trigtuple" and would fail to notice that it shouldn't
free that.
In addition to fixing the code, extend the test case added by 25b692568 so that it covers multiple-trigger-iterations cases.
This problem does not manifest in v12/HEAD, as a result of the
relevant code having been largely rewritten for slotification.
However, include the test case into v12/HEAD anyway, since this
is clearly an area that someone could break again in future.
Per report from Piotr Gabriel Kosinski. Back-patch into all
supported branches, since the bug seems quite old.
Diagnosis and code fix by Thomas Munro, test case by me.
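Roughly, the triggering scenario looks like this (table and trigger names
are hypothetical; more than one row trigger is needed so that the loop in
ExecBRUpdateTriggers iterates):

    CREATE TABLE tbl (a int);
    INSERT INTO tbl VALUES (1);
    CREATE FUNCTION keep_old() RETURNS trigger LANGUAGE plpgsql AS
    $$ BEGIN RETURN OLD; END $$;
    CREATE TRIGGER t1 BEFORE UPDATE ON tbl
      FOR EACH ROW EXECUTE PROCEDURE keep_old();
    CREATE TRIGGER t2 BEFORE UPDATE ON tbl
      FOR EACH ROW EXECUTE PROCEDURE keep_old();
    -- the second iteration sees oldtuple equal to trigtuple and must not
    -- free it
    UPDATE tbl SET a = a + 1;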
Tom Lane [Thu, 15 Aug 2019 19:21:47 +0000 (15:21 -0400)]
Fix plpgsql to re-look-up composite type names at need.
Commit 4b93f5799 rearranged things in plpgsql to make it cope better with
composite types changing underneath it intra-session. However, I failed to
consider the case of a composite type being dropped and recreated entirely.
In my defense, the previous coding didn't consider that possibility at all
either --- but it would accidentally work so long as you didn't change the
type's field list, because the built-at-compile-time list of component
variables would then still match the type's new definition. The new
coding, however, occasionally tries to re-look-up the type by OID, and
then fails to find the dropped type.
To fix this, we need to save the TypeName struct, and then redo the type
OID lookup from that. Of course that's expensive, so we don't want to do
it every time we need the type OID. This can be fixed in the same way that 4b93f5799 dealt with changes to composite types' definitions: keep an eye
on the type's typcache entry to see if its tupledesc has been invalidated.
(Perhaps, at some point, this mechanism should be generalized so it can
work for non-composite types too; but for now, plpgsql only tries to
cope with intra-session redefinitions of composites.)
I'm slightly hesitant to back-patch this into v11, because it changes
the contents of struct PLpgSQL_type as well as the signature of
plpgsql_build_datatype(), so in principle it could break code that is
poking into the innards of plpgsql. However, the only popular extension
of that ilk is pldebugger, and it doesn't seem to be affected. Since
this is a regression for people who were relying on the old behavior,
it seems worth taking the small risk of causing compatibility issues.
Per bug #15913 from Daniel Fiori. Back-patch to v11 where 4b93f5799
came in.
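A minimal sketch of the drop-and-recreate case, within a single session:

    CREATE TYPE mytype AS (a int, b text);
    CREATE FUNCTION f() RETURNS int LANGUAGE plpgsql AS
    $$ DECLARE v mytype; BEGIN SELECT 1, 'x' INTO v; RETURN v.a; END $$;
    SELECT f();        -- compiles the function, caching the type's OID
    DROP TYPE mytype;
    CREATE TYPE mytype AS (a int, b text);
    SELECT f();        -- formerly could fail to find the dropped type;
                       -- now the name is looked up again from the TypeName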
Tom Lane [Thu, 15 Aug 2019 16:22:12 +0000 (12:22 -0400)]
Use a hash table to de-duplicate NOTIFY events faster.
Previously, async.c got rid of duplicate notifications by scanning
the list of pending events to compare each one to the proposed new
event. This works okay for very small numbers of distinct events,
but degrades as O(N^2) for many events. We can improve matters by
using a hash table to probe for duplicates. So as not to add a
lot of overhead for the simple cases that the code did handle well
before, create the hash table only once a (sub)transaction has
queued more than 16 distinct notify events.
A downside is that we now have to do per-event work to propagate
a successful subtransaction's notify events up to its parent.
(But this isn't significant unless the subtransaction had many
events, in which case the O(N^2) behavior would have been in
play already, so we still come out ahead.)
We can make some lemonade out of this lemon, though: since we must
examine each event anyway, it's now possible to de-duplicate events
fully, rather than skipping that for events merged up from
subtransactions. Hence, remove the old weasel wording in notify.sgml
about whether de-duplication happens or not, and adjust the test
case in async-notify.spec that exhibited the old behavior.
While at it, rearrange the definition of struct Notification to make
it more compact and require just one palloc per event, rather than
two or three. This saves space when there are a lot of events,
in fact more than enough to buy back the space needed for the hash
table.
Patch by me, based on discussions around a different patch
submitted by Filip Rembiałkowski.
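For illustration, duplicates are now removed even when merged up from a
subtransaction:

    LISTEN chan;
    BEGIN;
    NOTIFY chan, 'hello';
    SAVEPOINT s;
    NOTIFY chan, 'hello';   -- formerly kept when merged up from the subxact
    RELEASE SAVEPOINT s;
    COMMIT;                 -- the listener receives a single notification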
Tom Lane [Wed, 14 Aug 2019 19:09:19 +0000 (15:09 -0400)]
Fix ALTER SYSTEM to cope with duplicate entries in postgresql.auto.conf.
ALTER SYSTEM itself normally won't make duplicate entries (although
up till this patch, it was possible to confuse it by writing case
variants of a GUC's name). However, if some external tool has appended
entries to the file, that could result in duplicate entries for a single
GUC name. In such a situation, ALTER SYSTEM did exactly the wrong thing,
because it replaced or removed only the first matching entry, leaving
the later one(s) still there and hence still determining the active value.
This patch fixes that by making ALTER SYSTEM sweep through the file and
remove all matching entries, then (if not ALTER SYSTEM RESET) append the
new setting to the end. This means entries will be in order of last
setting rather than first setting, but that shouldn't hurt anything.
Also, make the comparisons case-insensitive so that the right things
happen if you do, say, ALTER SYSTEM SET "TimeZone" = 'whatever'.
This has been broken since ALTER SYSTEM was invented, so back-patch
to all supported branches.
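For example, case variants of a GUC name now match existing entries, and
every duplicate line for that GUC is swept out of postgresql.auto.conf:

    ALTER SYSTEM SET "TimeZone" = 'UTC';  -- replaces all timezone entries
    ALTER SYSTEM RESET timezone;          -- removes all matching lines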
Peter Geoghegan [Wed, 14 Aug 2019 18:32:35 +0000 (11:32 -0700)]
Remove block number field from nbtree stack.
The initial value of the nbtree stack downlink block number field
recorded during an initial descent of the tree wasn't actually used.
Both _bt_getstackbuf() callers overwrote the value with their own value.
Remove the block number field from the stack struct, and add a child
block number argument to _bt_getstackbuf() in its place. This makes the
overall design of _bt_getstackbuf() clearer.
Author: Peter Geoghegan
Reviewed-By: Anastasia Lubennikova
Discussion: https://postgr.es/m/CAH2-Wzmx+UbXt2YNOUCZ-a04VdXU=S=OHuAuD7Z8uQq-PXTYUg@mail.gmail.com
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/eeaf2f99-a1a6-8aca-3f43-9ab0b2fb112a%402ndquadrant.com
Michael Paquier [Wed, 14 Aug 2019 04:37:48 +0000 (13:37 +0900)]
Fix random regression failure in test case "collate.icu.utf8"
This is a fix similar to 2d7d67cc, where slight plan alteration can
cause a random failure of this regression test because of an incorrect
tuple ordering, except that this one involves lookups of pg_type.
Similarly to the other case, add ORDER BY clauses to ensure the output
order.
The failure has been seen at least once on buildfarm member skink.
Reported-by: Thomas Munro
Discussion: https://postgr.es/m/CA+hUKGLjR9ZBvhXcr9b-NSBHPw9aRgbjyzGE+kqLsT4vwX+nkQ@mail.gmail.com
Backpatch-through: 12
Peter Geoghegan [Wed, 14 Aug 2019 00:16:44 +0000 (17:16 -0700)]
Remove obsolete nbtree README commentary.
Commit d2086b08b02 removed almost all cases where nbtree must release a
read buffer lock and acquire a write buffer lock instead, so remaining
cases in which that's still necessary are not notable enough to appear
in the nbtree README.
More importantly, holding on to a buffer pin in cases where nbtree must
trade a read lock for a write lock is very unlikely to save any I/O.
This seems to have been a long overlooked throwback to a time when
nbtree cared about write-ordering dependencies, and performed
synchronous buffer writes. It hasn't worked that way in many years.
Tom Lane [Tue, 13 Aug 2019 20:57:58 +0000 (16:57 -0400)]
Un-break pg_dump for pre-8.3 source servers.
Commit 07b39083c inserted an unconditional reference to pg_opfamily,
which of course fails on servers predating that catalog. Fortunately,
the case it's trying to solve can't occur on such old servers (AFAIK).
Hence, just skip the additional code when the source predates 8.3.
Per bug #15955 from sly. Back-patch to all supported branches,
like the previous patch.
Peter Geoghegan [Tue, 13 Aug 2019 18:54:26 +0000 (11:54 -0700)]
Use PageIndexTupleOverwrite() within nbtree.
Use the PageIndexTupleOverwrite() bufpage.c routine within nbtree
instead of deleting a tuple and re-inserting its replacement. This
makes the intent of affected code slightly clearer. It also makes
CREATE INDEX slightly faster, since there is no longer a need to shift
every leaf page's line pointer array back and forth during index builds.
Author: Peter Geoghegan, Anastasia Lubennikova
Reviewed-By: Anastasia Lubennikova
Discussion: https://postgr.es/m/CAH2-Wz=Zk=B9+Vwm376WuO7YTjFc2SSskifQm4Nme3RRRPtOSQ@mail.gmail.com
Alvaro Herrera [Tue, 13 Aug 2019 14:26:04 +0000 (10:26 -0400)]
Don't constraint-exclude partitioned tables as much
We only need to invoke constraint exclusion on partitioned tables when
they are a partition, and they themselves contain a default partition;
it's not necessary otherwise, and it's expensive, so avoid it. Also, we
were trying once for each clause separately, but we can do it for all
the clauses at once.
While at it, centralize setting of RelOptInfo->partition_qual instead of
computing it in slightly different ways in different places.
Per complaints from Simon Riggs about 4e85642d935e; reviewed by Yuzuko
Hosoya, Kyotaro Horiguchi.
Author: Amit Langote. I (Álvaro) again mangled the patch somewhat.
Discussion: https://postgr.es/m/CANP8+j+tMCY=nEcQeqQam85=uopLBtX-2vHiLD2bbp7iQQUKpA@mail.gmail.com
Michael Paquier [Tue, 13 Aug 2019 04:53:41 +0000 (13:53 +0900)]
Fix inconsistencies and typos in the tree, take 10
This addresses some issues with unnecessary code comments, fixes various
typos in docs and comments, and removes some orphaned structures and
definitions.
Author: Alexander Lakhin
Discussion: https://postgr.es/m/9aabc775-5494-b372-8bcb-4dfc0bd37c68@gmail.com
Michael Paquier [Tue, 13 Aug 2019 01:55:19 +0000 (10:55 +0900)]
Fix random regression failure in test case "temp"
This test case could fail because of an incorrect result ordering when
looking up pg_class entries. This commit adds an ORDER BY to the
culprit query. The failure was likely caused by a plan switch: by
default, the planner would choose an index-only scan or an index scan,
but even a small change in the startup cost could cause a bitmap heap
scan to be chosen instead, triggering the failure.
While on it, switch some filtering quals to a regular expression as per
an idea of Tom Lane. As previously written, the quals would have
selected any relations whose name begins with "temp". And that could
cause failures if another test running in parallel began to use similar
relation names.
Per report from buildfarm member anole, though the failure was very
rare. This test has been introduced by 319a810, so backpatch down to
v10.
Peter Geoghegan [Mon, 12 Aug 2019 22:21:32 +0000 (15:21 -0700)]
amcheck: Skip unlogged relations during recovery.
contrib/amcheck failed to consider the possibility that unlogged
relations will not have any main relation fork files when running in hot
standby mode. This led to low-level "can't happen" errors that complain
about the absence of a relfilenode file.
To fix, simply skip verification of unlogged index relations during
recovery. In passing, add a direct check for the presence of a main
fork just before verification proper begins, so that we cleanly verify
the presence of the main relation fork file.
Author: Andrey Borodin, Peter Geoghegan
Reported-By: Andrey Borodin
Diagnosed-By: Andrey Borodin
Discussion: https://postgr.es/m/DA9B33AC-53CB-4643-96D4-7A0BBC037FA1@yandex-team.ru
Backpatch: 10-, where amcheck was introduced.
Tom Lane [Mon, 12 Aug 2019 17:15:47 +0000 (13:15 -0400)]
Fix planner's test for case-foldable characters in ILIKE with ICU.
As coded, the ICU-collation path in pattern_char_isalpha() failed
to consider regular ASCII letters to be case-varying. This led to
like_fixed_prefix treating too much of an ILIKE pattern as being a
fixed prefix, so that indexscans derived from an ILIKE clause might
miss entries that they should find.
Per bug #15892 from James Inform. This is an oversight in the original
ICU patch (commit eccfef81e), so back-patch to v10 where that came in.
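A sketch of the affected scenario, assuming a build with ICU and the stock
"en-x-icu" collation (table and index are hypothetical):

    CREATE TABLE t (name text COLLATE "en-x-icu");
    CREATE INDEX ON t (name text_pattern_ops);
    -- 'abc' must be treated as case-varying, so an indexscan derived from
    -- this clause may not assume a fixed 'abc' prefix
    SELECT * FROM t WHERE name ILIKE 'abc%';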
Tom Lane [Mon, 12 Aug 2019 15:58:35 +0000 (11:58 -0400)]
Remove EState.es_range_table_array.
Now that list_nth is O(1), there's no good reason to maintain a
separate array of RTE pointers rather than indexing into
estate->es_range_table. Deleting the array doesn't save all that
much either; but just on cleanliness grounds, it's better not to
have duplicate representations of the identical information.
Tom Lane [Mon, 12 Aug 2019 15:20:18 +0000 (11:20 -0400)]
Rationalize use of list_concat + list_copy combinations.
In the wake of commit 1cff1b95a, the result of list_concat no longer
shares the ListCells of the second input. Therefore, we can replace
"list_concat(x, list_copy(y))" with just "list_concat(x, y)".
To improve call sites that were list_copy'ing the first argument,
or both arguments, invent "list_concat_copy()" which produces a new
list sharing no ListCells with either input. (This is a bit faster
than "list_concat(list_copy(x), y)" because it makes the result list
the right size to start with.)
In call sites that were not list_copy'ing the second argument, the new
semantics mean that we are usually leaking the second List's storage,
since typically there is no remaining pointer to it. We considered
inventing another list_copy variant that would list_free the second
input, but concluded that for most call sites it isn't worth worrying
about, given the relative compactness of the new List representation.
(Note that in cases where such leakage would happen, the old code
already leaked the second List's header; so we're only discussing
the size of the leak not whether there is one. I did adjust two or
three places that had been troubling to free that header so that
they manually free the whole second List.)
Take into account that pg_server_to_any() may return the input string "as is".
Reported-by: Andrew Dunstan, Thomas Munro
Discussion: https://postgr.es/m/0ed83a33-d900-466a-880a-70ef456c721f%402ndQuadrant.com
Author: Alexander Korotkov, Thomas Munro
Backpatch-through: 12
Tom Lane [Sun, 11 Aug 2019 22:55:32 +0000 (18:55 -0400)]
Partially revert "Insert temporary debugging output in regression tests."
This reverts much of commit f03a9ca4366d064d89b7cf7ed75d4e43f2ed0667,
but leaves the relpages/reltuples probe in select_parallel.sql.
The pg_stat_all_tables probes are unstable enough to be annoying,
and it no longer seems likely that they will teach us anything more
about the underlying problem. I'd still like some more confirmation
though that the observed plan instability is caused by VACUUM leaving
relpages/reltuples as zero for one of these tables.
We have implemented jsonpath string comparison using the default database
locale. However, the standard requires us to compare Unicode codepoints.
This commit implements that, but for performance reasons we still use
per-byte comparison for the "==" operator. Thus, for consistency, the other
comparison operators do per-byte comparison if the Unicode codepoints appear
to be equal.
In some edge cases, when the same Unicode codepoints have different binary
representations in the database encoding, we diverge from the standard to
achieve better performance of the "==" operator. In the future, to implement
strict standard conformance, we can normalize input JSON strings.
Original patch was written by Nikita Glukhov, rewritten by me.
Reported-by: Markus Winand
Discussion: https://postgr.es/m/8B7FA3B4-328D-43D7-95A8-37B8891B8C78%40winand.at
Author: Nikita Glukhov, Alexander Korotkov
Backpatch-through: 12
Tom Lane [Sat, 10 Aug 2019 15:30:11 +0000 (11:30 -0400)]
Fix "ANALYZE t, t" inside a transaction block.
This failed with either "tuple already updated by self" or "duplicate
key value violates unique constraint", depending on whether the table
had previously been analyzed or not. The reason is that ANALYZE tried
to insert or update the same pg_statistic rows twice, and there was no
CommandCounterIncrement between. So add one. The same case works fine
outside a transaction block, because then there's a whole transaction
boundary between, as a consequence of the way VACUUM works.
This issue has been latent all along, but the problem was unreachable
before commit 11d8d72c2 added the ability to specify multiple tables
in ANALYZE. We could, perhaps, alternatively fix it by adding code to
de-duplicate the list of VacuumRelations --- but that would add a
lot of overhead to work around dumb commands, so it's not attractive.
Per bug #15946 from Yaroslav Schekin. Back-patch to v11.
(Note: in v11 I also back-patched the test added by commit 23224563d;
otherwise the problem doesn't manifest in the test I added, because
"vactst" is empty when the tests for multiple ANALYZE targets are
reached. That seems like not a very good thing anyway, so I did this
rather than rethinking the choice of test case.)
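A minimal reproducer sketch (the table needs some contents, so that
pg_statistic rows actually get written):

    BEGIN;
    CREATE TABLE vactst (i int);
    INSERT INTO vactst SELECT generate_series(1, 100);
    -- formerly failed with a duplicate pg_statistic key or
    -- "tuple already updated by self"
    ANALYZE vactst, vactst;
    COMMIT;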
Peter Geoghegan [Sat, 10 Aug 2019 00:06:45 +0000 (17:06 -0700)]
Rename tuplesort.c's SortTuple.tupindex field.
Rename the "tupindex" field from tuplesort.c's SortTuple struct to
"srctape", since it can only ever be used to store a source/input tape
number when merging external sort runs. This has been the case since
commit 8b304b8b72b, which removed replacement selection sort from
tuplesort.c.
Tom Lane [Fri, 9 Aug 2019 16:33:43 +0000 (12:33 -0400)]
Cosmetic improvements in setup of planner's per-RTE arrays.
Merge setup_append_rel_array into setup_simple_rel_arrays. There's no
particularly good reason to keep them separate, and it's inconsistent
with the lack of separation in expand_planner_arrays. The only apparent
benefit was that the fast path for trivial queries in query_planner()
doesn't need to set up the append_rel_array; but all we're saving there
is an if-test and NULL assignment, which surely ought to be negligible.
Michael Paquier [Fri, 9 Aug 2019 02:05:14 +0000 (11:05 +0900)]
Refactor logic to remove trailing CR/LF characters from strings
b654714 has reworked the way trailing CR/LF characters are removed from
strings. This commit introduces a new routine in common/string.c and
refactors the code so that the logic is mostly in a single place.
Author: Michael Paquier
Reviewed-by: Bruce Momjian
Discussion: https://postgr.es/m/20190801031820.GF29334@paquier.xyz
Peter Geoghegan [Thu, 8 Aug 2019 20:20:44 +0000 (13:20 -0700)]
Update obsolete tuplesort READTUP() comment.
READTUP() routines do not and cannot use the resettable "tuplecontext"
memory context, since it is deleted when merging begins. Update an
obsolete comment that claimed otherwise. This was an oversight in
commit e94568ecc10.
I left PG_CMD_PUTS around even though it could be handled by
PG_CMD_PRINTF since PG_CMD_PUTS is sometimes called with non-literal
arguments, and so that would create a potential problem if such a
string contained percent signs.
Tom Lane [Wed, 7 Aug 2019 22:09:28 +0000 (18:09 -0400)]
Doc: document permissions required for ANALYZE.
VACUUM's reference page had this text, but ANALYZE's didn't. That's
a clear oversight given that section 5.7 explicitly delegates the
responsibility to define permissions requirements to the individual
commands' man pages.
Per gripe from Isaac Morland. Back-patch to all supported branches.
Alvaro Herrera [Wed, 7 Aug 2019 16:21:22 +0000 (12:21 -0400)]
Apply constraint exclusion more generally in partitioning
We were applying constraint exclusion on the partition constraint when
generating pruning steps for a clause, but only for the rather
restricted situation of them being boolean OR operators; however it is
possible to have differently shaped clauses that also benefit from
constraint exclusion. This applies particularly to the default
partition, since its constraint is in essence a long list of OR'ed
subclauses ... but it applies to other cases too. So in certain cases
we're scanning partitions that we don't need to.
Remove the specialized code in OR clauses, and add a generally
applicable test of the clause refuting the partition constraint; mark
the whole pruning operation as contradictory if it hits.
This has the unwanted side-effect of testing some (most? all?)
constraints more than once if constraint_exclusion=on. That seems
unavoidable as far as I can tell without some additional work, but
that's not the recommended setting for that parameter anyway.
However, because this imposes additional processing cost for all
queries using partitioned tables, I decided not to backpatch this
change.
Author: Amit Langote, Yuzuko Hosoya, Álvaro Herrera
Reviewers: Shawn Wang, Thibaut Madeleine, Yoshikazu Imai, Kyotaro
Horiguchi; they were also uncredited reviewers for commit 489247b0e615.
Discussion: https://postgr.es/m/9bb31dfe-b0d0-53f3-3ea6-e64b811424cf@lab.ntt.co.jp
In serializable mode, heap_hot_search_buffer() incorrectly acquired a
predicate lock on the root tuple, not the returned tuple that satisfied
the visibility checks. As explained in README-SSI, the predicate lock does
not need to be copied or extended to other tuple versions, but for that to
work, the correct, visible, tuple version must be locked in the first
place.
The original SSI commit had this bug in it, but it was fixed back in 2013,
in commit 81fbbfe335. But unfortunately, it was reintroduced a few months
later in commit b89e151054. Wising up from that, add a regression test
to cover this, so that it doesn't get reintroduced again. Also, move the
code that sets 't_self', so that it happens at the same time that the
other HeapTuple fields are set, to make it more clear that all the code in
the loop operates on the "current" tuple in the chain, not the root tuple.
Bug spotted by Andres Freund, analysis and original fix by Thomas Munro,
test case and some additional changes to the fix by Heikki Linnakangas.
Backpatch to all supported versions (9.4).
Michael Paquier [Wed, 7 Aug 2019 09:16:31 +0000 (18:16 +0900)]
Fix some incorrect parsing of time with time zone strings
When parsing a timetz string with a dynamic timezone abbreviation, or with
no timezone specified, it was possible to generate incorrect timestamps
based on a date built from non-initialized variables, if the input string
did not fully specify a date. This is already checked when a full timezone
spec is included in the input string, but the two other cases mentioned
above missed the same checks.
This gets fixed by generating an error for such invalid input, in short
when a date is not fully specified.
Valgrind was complaining about this problem.
Bug: #15910
Author: Alexander Lakhin
Discussion: https://postgr.es/m/15910-2eba5106b9aa0c61@postgresql.org
Backpatch-through: 9.4
Michael Paquier [Wed, 7 Aug 2019 01:28:16 +0000 (10:28 +0900)]
Adjust tuple data lookup logic in multi-insert logical decoding
As of now, logical decoding of a multi-insert has been scanning all
xl_multi_insert_tuple entries only if XLH_INSERT_CONTAINS_NEW_TUPLE was
getting set in the record. This is not an issue on HEAD as multi-insert
records are not used for system catalogs, but the logical decoding logic
includes all the code necessary to handle that properly, except that the
code failed to iterate correctly over all xl_multi_insert_tuple entries
when the flag is not set. Hence, when trying to use multi-insert for
system catalogs, an assertion would be triggered.
An upcoming patch is going to make use of multi-insert for system
catalogs, and this fixes the logic to make sure that all entries are
scanned correctly without softening the existing assertions.
Reported-by: Daniel Gustafsson
Author: Michael Paquier
Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/CBFFD532-C033-49EB-9A5A-F67EAEE9EB0B@yesql.se
Tom Lane [Tue, 6 Aug 2019 22:04:51 +0000 (18:04 -0400)]
Fix intarray's GiST opclasses to not fail for empty arrays with <@.
contrib/intarray considers "arraycol <@ constant-array" to be indexable,
but its GiST opclass code fails to reliably find index entries for empty
array values (which of course should trivially match such queries).
This is because the test condition to see whether we should descend
through a non-leaf node is wrong.
Unfortunately, empty array entries could be anywhere in the index,
as these index opclasses are currently designed. So there's no way
to fix this except by lobotomizing <@ indexscans to scan the whole
index ... which is what this patch does. That's pretty unfortunate:
the performance is now actually worse than a seqscan, in most cases.
We'd be better off to remove <@ from the GiST opclasses entirely,
and perhaps a future non-back-patchable patch will do so.
In the meantime, applications whose performance is adversely impacted
have a couple of options. They could switch to a GIN index, which
doesn't have this bug, or they could replace "arraycol <@ constant-array"
with "arraycol <@ constant-array AND arraycol && constant-array".
That will provide about the same performance as before, and it will find
all non-empty subsets of the given constant-array, which is all that
could reliably be expected of the query before.
While at it, add some more regression test cases to improve code
coverage of contrib/intarray.
In passing, adjust resize_intArrayType so that when it's returning an
empty array, it uses construct_empty_array for that rather than
cowboy hacking on the input array. While the hack produces an array
that looks valid for most purposes, it isn't bitwise equal to empty
arrays produced by other code paths, which could have subtle odd
effects. I don't think this code path is performance-critical
enough to justify such shortcuts. (Back-patch this part only as far
as v11; before commit 01783ac36 we were not careful about this in
other intarray code paths either.)
Back-patch the <@ fixes to all supported versions, since this was
broken from day one.
Patch by me; thanks to Alexander Korotkov for review.
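The workaround mentioned above looks like this, for a hypothetical table
with a GiST (gist__int_ops) index on arraycol:

    SELECT *
      FROM tbl
     WHERE arraycol <@ '{1,2,3}'::int[]
       AND arraycol && '{1,2,3}'::int[];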
Tom Lane [Tue, 6 Aug 2019 21:08:07 +0000 (17:08 -0400)]
Save Kerberos and LDAP daemon logs where the buildfarm can find them.
src/test/kerberos and src/test/ldap try to run private authentication
servers, which of course might fail. The logs from these servers
were being dropped into the tmp_check/ subdirectory, but they should
be put in tmp_check/log/, because the buildfarm will only capture
log files in that subdirectory. Without the log output there's
little hope of diagnosing buildfarm failures related to these servers.
Backpatch to v11 where these test suites were added.
Peter Geoghegan [Mon, 5 Aug 2019 18:47:34 +0000 (11:47 -0700)]
Show specific OID suggestion in unused_oids output.
Commit a6417078 established a new project policy around OID assignment:
new patches are encouraged to choose a random OID in the 8000..9999
range when a manually-assigned OID is required (if multiple OIDs are
required, a consecutive block of OIDs starting from the random point
should be used). Catalog entries added by committed patches that use
OIDs from this "unstable" range are renumbered after feature freeze.
This practice minimizes OID collisions among concurrently-developed
patches.
Show a specific random OID suggestion when the unused_oids script is
run. This makes it easy for patch authors to use a random OID from the
unstable range, per the new policy.
Author: Julien Rouhaud, Peter Geoghegan
Reviewed-By: Tom Lane
Discussion: https://postgr.es/m/CAH2-WzkkRs2ScmuBQ7xWi7xzp7fC1B3w0Nt8X+n4rBw5k+Z=zA@mail.gmail.com
Tom Lane [Mon, 5 Aug 2019 15:20:21 +0000 (11:20 -0400)]
Fix choice of comparison operators for cross-type hashed subplans.
Commit bf6c614a2 rearranged the lookup of the comparison operators
needed in a hashed subplan, and in so doing, broke the cross-type
case: it caused the original LHS-vs-RHS operator to be used to compare
hash table entries too (which of course are all of the RHS type).
This leads to C functions being passed a Datum that is not of the
type they expect, with the usual hazards of crashes and unauthorized
server memory disclosure.
For the set of hashable cross-type operators present in v11 core
Postgres, this bug is nearly harmless on 64-bit machines, which
may explain why it escaped earlier detection. But it is a live
security hazard on 32-bit machines; and of course there may be
extensions that add more hashable cross-type operators, which
would increase the risk.
Reported by Andreas Seltenreich. Back-patch to v11 where the
problem came in.
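For illustration, a query shape that exercises a hashed subplan with a
cross-type hashable operator (int8 vs int4); the tables are hypothetical:

    CREATE TABLE big (id8 int8);
    CREATE TABLE small (id4 int4);
    -- the cross-type operator int8 = int4 probes the hash table; before the
    -- fix it was also (wrongly) used to compare hash table entries
    SELECT * FROM big WHERE id8 NOT IN (SELECT id4 FROM small);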
Noah Misch [Mon, 5 Aug 2019 14:48:41 +0000 (07:48 -0700)]
Require the schema qualification in pg_temp.type_name(arg).
Commit aa27977fe21a7dfa4da4376ad66ae37cb8f0d0b5 introduced this
restriction for pg_temp.function_name(arg); do likewise for types
created in temporary schemas. Programs that this breaks should add
"pg_temp." schema qualification or switch to arg::type_name syntax.
Back-patch to 9.4 (all supported versions).
Michael Paquier [Mon, 5 Aug 2019 06:35:16 +0000 (15:35 +0900)]
Add safeguards in LSN, numeric and float calculation for custom errors
Those data types use parsing and/or calculation wrapper routines which
can generate some generic error messages in the event of a failure. The
caller of these routines can also pass a pointer variable settable by
the routine to track if an error has happened, letting the caller decide
what to do in the event of an error and what error message to generate.
Those routines have been lax about initializing the tracking flag, which
can be confusing when reading the code, so add some safeguards against
calls of these parsing routines that could lead to a dubious result.
The LSN parsing gains an assertion to make sure that the tracking flag
is set, while numeric and float paths initialize the flag to a saner
state.
Author: Jeevan Ladhe
Reviewed-by: Álvaro Herrera, Michael Paquier
Discussion: https://postgr.es/m/CAOgcT0NOM9oR0Hag_3VpyW0uF3iCU=BDUFSPfk9JrWXRcWQHqw@mail.gmail.com
As committed, statement sampling used the existing duration threshold
(log_min_duration_statement) when deciding which statements to sample.
The issue is that even the longest statements are subject to sampling,
and so may not end up logged. An improvement was proposed, introducing
a second duration threshold, but it would not be backwards compatible.
So we've decided to revert this feature - the separate threshold should
be part of the feature itself.
Tom Lane [Sun, 4 Aug 2019 18:05:34 +0000 (14:05 -0400)]
Fix handling of "undef" in contrib/jsonb_plperl.
Perl has multiple internal representations of "undef", and just
testing for SvTYPE(x) == SVt_NULL doesn't recognize all of them,
leading to "cannot transform this Perl type to jsonb" errors.
Use the approved test SvOK() instead.
Report and patch by Ivan Panchenko. Back-patch to v11 where
this module was added.