Peter Eisentraut [Mon, 29 Oct 2018 10:39:44 +0000 (11:39 +0100)]
Exclude temporary directories from pgindent
Exclude tmp_check and tmp_install from pgindent. In a fully-built
tree, pgindent would spend a lot of time digging through these
directories and ends up re-indenting installed header files.
Michael Paquier [Mon, 29 Oct 2018 07:38:54 +0000 (16:38 +0900)]
Improve description of pg_attrdef in documentation
The reference to pg_attribute is switched to a link, which is more
useful for the html documentation. The conditions under which a default
value is defined for a given column are made more general.
Author: Daniel Gustafsson Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/0E8748E3-8B7D-445E-9ABA-09DA5C7345CC@yesql.se
Andrew Dunstan [Sun, 28 Oct 2018 16:22:32 +0000 (12:22 -0400)]
Fix perl searchpath for modern perl for MSVC tools
Modern versions of perl no longer include the current directory in the
perl searchpath, as it's insecure. Instead of adding the current
directory, we get around the problem by adding the directory where the
script lives.
Michael Paquier [Fri, 26 Oct 2018 04:46:20 +0000 (13:46 +0900)]
Improve tab completion of CREATE EVENT TRIGGER in psql
This adds tab completion of the clauses WHEN and EXECUTE
FUNCTION|PROCEDURE clauses to CREATE EVENT TRIGGER, similar to CREATE
TRIGGER in the previous commit. This has version-dependent logic so as
FUNCTION is chosen over PROCEDURE for 11 and newer versions.
Author: Dagfinn Ilmari Mannsåker Reviewed-by: Tom Lane, Michael Paquier
Discussion: https://postgr.es/m/d8jmur4q4yc.fsf@dalvik.ping.uio.no
Michael Paquier [Fri, 26 Oct 2018 00:30:43 +0000 (09:30 +0900)]
Add tab completion of EXECUTE FUNCTION for CREATE TRIGGER in psql
The change to accept EXECUTE FUNCTION as well as EXECUTE PROCEDURE in
CREATE TRIGGER (added by 0a63f99) forgot to tell psql's tab completion
system about this. In passing, add tab completion of EXECUTE
FUNCTION/PROCEDURE after a complete WHEN ( … ) clause.
This change is version-aware, with FUNCTION being selected automatically
instead of PROCEDURE depending on the backend version, PROCEDURE being
an historical grammar kept for compatibility and considered as
deprecated in v11.
Author: Dagfinn Ilmari Mannsåker Reviewed-by: Tom Lane, Michael Paquier
Discussion: https://postgr.es/m/d8jmur4q4yc.fsf@dalvik.ping.uio.no
Michael Paquier [Thu, 25 Oct 2018 00:46:00 +0000 (09:46 +0900)]
Add pg_promote function
This function is able to promote a standby with this new SQL-callable
function. Execution access can be granted to non-superusers so that
failover tools can observe the principle of least privilege.
Andrew Dunstan [Wed, 24 Oct 2018 14:45:34 +0000 (10:45 -0400)]
Correctly set t_self for heap tuples in expand_tuple
Commit 16828d5c0 incorrectly set an invalid pointer for t_self for heap
tuples. This patch correctly copies it from the source tuple, and
includes a regression test that relies on it being set correctly.
Michael Paquier [Wed, 24 Oct 2018 08:02:37 +0000 (17:02 +0900)]
List wait events in alphabetical order
This changes the documentation, and the related structures so as
everything is consistent.
Some wait events were not listed alphabetically since their
introduction, others have been added rather randomly. Keeping all those
entries in order helps in maintenance, and helps the user looking at the
documentation.
Author: Michael Paquier, Kuntal Ghosh
Discussion: https://postgr.es/m/20181024002539.GI1658@paquier.xyz
Backpatch-through: 10, only for the documentation part to avoid an ABI
breakage.
Michael Paquier [Mon, 22 Oct 2018 02:04:48 +0000 (11:04 +0900)]
Set pg_class.relhassubclass for partitioned indexes
Like for relations, switching this parameter is optimistic by turning it
on each time a partitioned index gains a partition. So seeing this
parameter set to true means that the partitioned index has or has had
partitions. The flag cannot be reset yet for partitioned indexes, which
is something not obvious anyway as partitioned relations exist to have
partitions.
This allows to track more conveniently partition trees for indexes,
which will come in use with an upcoming patch helping in listing
partition trees with an SQL-callable function.
Author: Amit Langote Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/80306490-b5fc-ea34-4427-f29c52156052@lab.ntt.co.jp
Andrew Dunstan [Sun, 21 Oct 2018 13:00:13 +0000 (09:00 -0400)]
Don't try to test files named with a trailing dot on Windows
The pg_verify_checksums test tries to create files with corrupt data
named "123." and "123_." But on Windows a file name with a trailing dot
is the same as a file without the trailing dot. In the first case this
will create a file with a "valid" name, which causes the test to fail in
an unexpected way, and in the secongd case this will be redandant as the
test already creates a file named "123_".
Andrew Dunstan [Sat, 20 Oct 2018 13:02:36 +0000 (09:02 -0400)]
Lower privilege level of programs calling regression_main
On Windows this mean that the regression tests can now safely and
successfully run as Administrator, which is useful in situations like
Appveyor. Elsewhere it's a no-op.
Backpatch to 9.5 - this is harder in earlier branches and not worth the
trouble.
Tom Lane [Sat, 20 Oct 2018 02:22:57 +0000 (22:22 -0400)]
Client-side fixes for delayed NOTIFY receipt.
PQnotifies() is defined to just process already-read data, not try to read
any more from the socket. (This is a debatable decision, perhaps, but I'm
hesitant to change longstanding library behavior.) The documentation has
long recommended calling PQconsumeInput() before PQnotifies() to ensure
that any already-arrived message would get absorbed and processed.
However, psql did not get that memo, which explains why it's not very
reliable about reporting notifications promptly.
Also, most (not quite all) callers called PQconsumeInput() just once before
a PQnotifies() loop. Taking this recommendation seriously implies that we
should do PQconsumeInput() before each call. This is more important now
that we have "payload" strings in notification messages than it was before;
that increases the probability of having more than one packet's worth
of notify messages. Hence, adjust code as well as documentation examples
to do it like that.
Back-patch to 9.5 to match related server fixes. In principle we could
probably go back further with these changes, but given lack of field
complaints I doubt it's worthwhile.
Tom Lane [Sat, 20 Oct 2018 01:39:21 +0000 (21:39 -0400)]
Server-side fix for delayed NOTIFY and SIGTERM processing.
Commit 4f85fde8e introduced some code that was meant to ensure that we'd
process cancel, die, sinval catchup, and notify interrupts while waiting
for client input. But there was a flaw: it supposed that the process
latch would be set upon arrival at secure_read() if any such interrupt
was pending. In reality, we might well have cleared the process latch
at some earlier point while those flags remained set -- particularly
notifyInterruptPending, which can't be handled as long as we're within
a transaction.
To fix the NOTIFY case, also attempt to process signals (except
ProcDiePending) before trying to read.
Also, if we see that ProcDiePending is set before we read, forcibly set the
process latch to ensure that we will handle that signal promptly if no data
is available. I also made it set the process latch on the way out, in case
there is similar logic elsewhere. (It remains true that we won't service
ProcDiePending here unless we need to wait for input.)
The code for handling ProcDiePending during a write needs those changes,
too.
Also be a little more careful about when to reset whereToSendOutput,
and improve related comments.
Back-patch to 9.5 where this code was added. I'm not entirely convinced
that older branches don't have similar issues, but the complaint at hand
is just about the >= 9.5 code.
Tom Lane [Fri, 19 Oct 2018 23:36:34 +0000 (19:36 -0400)]
Sync our copy of the timezone library with IANA release tzcode2018f.
About half of this is purely cosmetic changes to reduce the diff between
our code and theirs, like inserting "const" markers where they have them.
The other half is tracking actual code changes in zic.c and localtime.c.
I don't think any of these represent near-term compatibility hazards, but
it seems best to stay up to date.
I also fixed longstanding bugs in our code for producing the
known_abbrevs.txt list, which by chance hadn't been exposed before,
but which resulted in some garbage output after applying the upstream
changes in zic.c. Notably, because upstream removed their old phony
transitions at the Big Bang, it's now necessary to cope with TZif files
containing no DST transition times at all.
Tom Lane [Fri, 19 Oct 2018 21:01:34 +0000 (17:01 -0400)]
Update time zone data files to tzdata release 2018f.
DST law changes in Chile, Fiji, and Russia (Volgograd).
Historical corrections for China, Japan, Macau, and North Korea.
Note: like the previous tzdata update, this involves a depressingly
large amount of semantically-meaningless churn in tzdata.zi. That
is a consequence of upstream's data compression method assigning
unstable abbreviations to DST rulesets. I complained about that
to them last time, and this version now uses an assignment method
that pays some heed to not changing abbreviations unnecessarily.
So hopefully, that'll be better going forward.
Michael Paquier [Fri, 19 Oct 2018 13:44:12 +0000 (22:44 +0900)]
Use whitelist to choose files scanned with pg_verify_checksums
The original implementation of pg_verify_checksums used a blacklist to
decide which files should be skipped for scanning as they do not include
data checksums, like pg_internal.init or pg_control. However, this
missed two things:
- Some files are created within builds of EXEC_BACKEND and these were
not listed, causing failures on Windows.
- Extensions may create custom files in data folders, causing the tool
to equally fail.
This commit switches to a whitelist-like method instead by checking if
the files to scan are authorized relation files. This is close to a
reverse-engineering of what is defined in relpath.c in charge of
building the relation paths, and we could consider refactoring what this
patch does so as all routines are in a single place. This is left for
later.
This is based on a suggestion from Andres Freund. TAP tests are updated
so as multiple file patterns are tested. The bug has been spotted by
various buildfarm members as a result of b34e84f which has introduced
the TAP tests of pg_verify_checksums.
Author: Michael Paquier Reviewed-by: Andrew Dunstan, Michael Banck
Discussion: https://postgr.es/m/20181012005614.GC26424@paquier.xyz
Backpatch-through: 11
Thomas Munro [Fri, 19 Oct 2018 00:59:14 +0000 (13:59 +1300)]
Refactor pid, random seed and start time initialization.
Background workers, including parallel workers, were generating
the same sequence of numbers in random(). This showed up as DSM
handle collisions when Parallel Hash created multiple segments,
but any code that calls random() in background workers could be
affected if it cares about different backends generating different
numbers.
Repair by making sure that all new processes initialize the seed
at the same time as they set MyProcPid and MyStartTime in a new
function InitProcessGlobals(), called by the postmaster, its
children and also standalone processes. Also add a new high
resolution MyStartTimestamp as a potentially useful by-product,
and remove SessionStartTime from struct Port as it is now
redundant.
No back-patch for now, as the known consequences so far are just
a bunch of harmless shm_open(O_EXCL) collisions.
Author: Thomas Munro Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/CAEepm%3D2eJj_6%3DB%2B2tEpGu2nf1BjthCf9nXXUouYvJJ4C5WSwhg%40mail.gmail.com
Tom Lane [Thu, 18 Oct 2018 18:55:23 +0000 (14:55 -0400)]
Still further rethinking of build changes for macOS Mojave.
To avoid the sorts of problems complained of by Jakob Egger, it'd be
best if configure didn't emit any references to the sysroot path at all.
In the case of PL/Tcl, we can do that just by keeping our hands off the
TCL_INCLUDE_SPEC string altogether. In the case of PL/Perl, we need to
substitute -iwithsysroot for -I in the compile commands, which is easily
handled if we change to using a configure output variable that includes
the switch not only the directory name. Since PL/Tcl and PL/Python
already do it like that, this seems like good consistency cleanup anyway.
Hence, this replaces the advice given to Perl-related extensions in commit 5e2217131; instead of writing "-I$(perl_archlibexp)/CORE", they should
just write "$(perl_includespec)". (The old way continues to work, but not
on recent macOS.)
It's still the case that configure needs to be aware of the sysroot
path internally, but that's cleaner than what we had before.
Tom Lane [Wed, 17 Oct 2018 20:41:00 +0000 (16:41 -0400)]
Improve some comments related to executor result relations.
es_leaf_result_relations doesn't exist; perhaps this was an old name
for es_tuple_routing_result_relations, or maybe this comment has gone
unmaintained through multiple rounds of whacking the code around.
Related comment in execnodes.h was both obsolete and ungrammatical.
Tom Lane [Wed, 17 Oct 2018 19:06:38 +0000 (15:06 -0400)]
Fix minor bug in isolationtester.
If the lock wait query failed, isolationtester would report the
PQerrorMessage from some other connection, meaning there would be
no message or an unrelated one. This seems like a pretty unlikely
occurrence, but if it did happen, this bug could make it really
difficult/confusing to figure out what happened. That seems to
justify patching all the way back.
In passing, clean up another place where the "wrong" conn was used
for an error report. That one's not actually buggy because it's
a different alias for the same connection, but it's still confusing
to the reader.
Peter Eisentraut [Wed, 17 Oct 2018 18:31:20 +0000 (20:31 +0200)]
Fix crash in multi-insert COPY
A bug introduced in 0d5f05cde011512e605bb2688d9b1fbb5b3ae152
considered the *previous* partition's triggers when deciding whether
multi-insert can be used. Rearrange the code so that the current
partition is considered.
Tom Lane [Wed, 17 Oct 2018 18:22:33 +0000 (14:22 -0400)]
Minor additional improvements for ecpglib/prepare.c.
Avoid allocating never-used entries in stmtCacheEntries[], other than the
intentionally-unused zero'th entry. Tie the array size directly to the
bucket count and size, rather than having undocumented dependencies between
three magic constants. Fix the hash calculation to be platform-independent
--- notably, it was sensitive to the signed'ness of "char" before, not to
mention having an unnecessary hard-wired dependency on the existence and
size of type "long long". (The lack of complaints says it's been a long
time since anybody tried to build PG on a compiler without "long long",
and certainly with the requirement for C99 this isn't a live bug anymore.
But it's still not per project coding style.) Fix ecpg_auto_prepare's
new-cache-entry path so that it increments the exec count for the new
cache entry not the dummy zero'th entry.
The last of those is an actual bug, though one of little consequence;
the rest is mostly future-proofing and neatnik-ism. Doesn't seem
necessary to back-patch.
Tom Lane [Wed, 17 Oct 2018 16:26:48 +0000 (12:26 -0400)]
Improve tzparse's handling of TZDEFRULES ("posixrules") zone data.
In the IANA timezone code, tzparse() always tries to load the zone
file named by TZDEFRULES ("posixrules"). Previously, we'd hacked
that logic to skip the load in the "lastditch" code path, which we use
only to initialize the default "GMT" zone during GUC initialization.
That's critical for a couple of reasons: since we do not support leap
seconds, we *must not* allow "GMT" to have leap seconds, and since this
case runs before the GUC subsystem is fully alive, we'd really rather
not take the risk of pg_open_tzfile throwing any errors.
However, that still left the code reading TZDEFRULES on every other
call, something we'd noticed to the extent of having added code to cache
the result so it was only done once per process not a lot of times.
Andres Freund complained about the static data space used up for the
cache; but as long as the logic was like this, there was no point in
trying to get rid of that space.
We can improve matters by looking a bit more closely at what the IANA
code actually needs the TZDEFRULES data for. One thing it does is
that if "posixrules" is a leap-second-aware zone, the leap-second
behavior will be absorbed into every POSIX-style zone specification.
However, that's a behavior we'd really prefer to do without, since
for our purposes the end effect is to render every POSIX-style zone
name unsupported. Otherwise, the TZDEFRULES data is used only if
the POSIX zone name specifies DST but doesn't include a transition
date rule (e.g., "EST5EDT" rather than "EST5EDT,M3.2.0,M11.1.0").
That is a minority case for our purposes --- in particular, it
never happens when tzload() invokes tzparse() to interpret a
transition date rule string found in a tzdata zone file.
Hence, if we legislate that we're going to ignore leap-second data
from "posixrules", we can postpone the TZDEFRULES load into the path
where we actually need to substitute for a missing date rule string.
That means it will never happen at all in common scenarios, making it
reasonable to dynamically allocate the cache space when it does happen.
Even when the data is already loaded, this saves some cycles in the
common code path since we avoid a memcpy of 23KB or so. And, IMO at
least, this is a less ugly hack on the IANA logic than what we had
before, since it's not messing with the lastditch-vs-regular code paths.
Back-patch to all supported branches, not so much because this is a
critical change as that I want to keep all our copies of the IANA
timezone code in sync.
Tom Lane [Wed, 17 Oct 2018 04:04:48 +0000 (00:04 -0400)]
Avoid statically allocating statement cache in ecpglib/prepare.c.
This removes a megabyte of storage that isn't used at all in ecpglib's
default operating mode --- you have to enable auto-prepare to get any
use out of it. Seems well worth the trouble to allocate on demand.
Tom Lane [Wed, 17 Oct 2018 03:43:15 +0000 (23:43 -0400)]
Formatting cleanup in ecpglib/prepare.c.
Looking at this code made my head hurt. Format the comments more
like the way it's done elsewhere, break a few overly long lines.
No actual code changes in this commit.
Andres Freund [Tue, 16 Oct 2018 21:51:18 +0000 (14:51 -0700)]
Reorder FmgrBuiltin members, saving 25% in size.
That's worth it, as fmgr_builtins is frequently accessed, and as
fmgr_builtins is one of the biggest constant variables in a backend.
On most 64bit systems this will change the size of the struct from
32byte to 24bytes. While that could make indexing into the array
marginally more expensive, the higher cache hit ratio is worth more,
especially because these days fmgr_builtins isn't searched with a
binary search anymore (c.f. 212e6f34d5).
Tom Lane [Tue, 16 Oct 2018 20:27:15 +0000 (16:27 -0400)]
Back off using -isysroot on Darwin.
Rethink the solution applied in commit 5e2217131 to get PL/Tcl to
build on macOS Mojave. I feared that adding -isysroot globally might
have undesirable consequences, and sure enough Jakob Egger reported
one: it complicates building extensions with a different Xcode version
than was used for the core server. (I find that a risky proposition
in general, but apparently it works most of the time, so we shouldn't
break it if we don't have to.)
We'd already adopted the solution for PL/Perl of inserting the sysroot
path directly into the -I switches used to find Perl's headers, and we
can do the same thing for PL/Tcl by changing the -iwithsysroot switch
that Apple's tclConfig.sh reports. This restricts the risks to PL/Perl
and PL/Tcl themselves and directly-dependent extensions, which is a lot
more pleasing in general than a global -isysroot switch.
Along the way, tighten the test to see if we need to inject the sysroot
path into $perl_includedir, as I'd speculated about upthread but not
gotten round to doing.
Andres Freund [Tue, 16 Oct 2018 19:05:50 +0000 (12:05 -0700)]
Mark constantly allocated dest receiver as const.
This allows the compiler / linker to mark affected pages as read-only.
Doing so requires casting constness away, as CreateDestReceiver()
returns both constant and non-constant dest receivers. That's fine
though, as any modification of the statically allocated receivers
would already have been a bug (and would now be caught on some
platforms).
Andres Freund [Tue, 16 Oct 2018 19:05:50 +0000 (12:05 -0700)]
Add macro to cast away const without allowing changes to underlying type.
The new unconsitify(underlying_type, var) macro allows to cast
constness away from a variable, but doesn't allow changing the
underlying type. Enforcement of the latter currently only works for
gcc like compilers.
Please note IT IS NOT SAFE to cast constness away if the variable will ever
be modified (it would be undefined behaviour). Doing so anyway can cause
compiler misoptimizations or runtime crashes (modifying readonly memory).
It is only safe to use when the the variable will not be modified, but API
design or language restrictions prevent you from declaring that
(e.g. because a function returns both const and non-const variables).
This'll be used in an upcoming change, but seems like it's independent
infrastructure.
Author: Andres Freund
Discussion: https://postgr.es/m/20181015200754.7y7zfuzsoux2c4ya@alap3.anarazel.de
Tom Lane [Tue, 16 Oct 2018 18:57:14 +0000 (14:57 -0400)]
Be smarter about age-counter overflow in formatting.c caches.
The previous code here simply threw away whatever it knew about cache
entry ages whenever a counter overflow occurred. Since the counter
is int width and will be bumped once per format function execution,
overflows are not really so rare as to not be worth thinking about.
Instead, let's deal with the situation by halving all the age values,
essentially rescaling the age metric. In that way, we retain a
pretty accurate (if not quite perfect) idea of which entries are oldest.
Tom Lane [Tue, 16 Oct 2018 17:56:58 +0000 (13:56 -0400)]
Avoid rare race condition in privileges.sql regression test.
We created a temp table, then switched to a new session, leaving
the old session to clean up its temp objects in background. If that
took long enough, the eventual attempt to drop the user that owns
the temp table could fail, as exhibited today by sidewinder.
Fix by dropping the temp table explicitly when we're done with it.
It's been like this for quite some time, so back-patch to all
supported branches.
Tom Lane [Tue, 16 Oct 2018 17:11:05 +0000 (13:11 -0400)]
Avoid statically allocating formatting.c's format string caches.
This eliminates circa 120KB of static data from Postgres' memory
footprint. In some usage patterns that space will get allocated
anyway, but in many processes it never will be allocated.
We can improve matters further by allocating only as many cache
entries as we actually use, rather than allocating the whole array
on first use. However, to avoid wasting lots of space due to
palloc's habit of rounding requests up to power-of-2 sizes, tweak
the maximum cacheable format string length to make the struct sizes
be powers of 2 or just less. The sizes I chose make the maximums
a little bit less than they were before, but I doubt it matters much.
While at it, rearrange struct FormatNode to avoid wasting quite so
much padding space. This change actually halves the size of that
struct on 64-bit machines.
Andres Freund [Tue, 16 Oct 2018 16:44:43 +0000 (09:44 -0700)]
Correct constness of system attributes in heap.c & prerequisites.
This allows the compiler / linker to mark affected pages as read-only.
There's a fair number of pre-requisite changes, to allow the const
properly be propagated. Most of consts were already required for
correctness anyway, just not represented on the type-level. Arguably
we could be more aggressive in using consts in related code, but..
This requires using a few of the types underlying typedefs that
removes pointers (e.g. const NameData *) as declaring the typedefed
type constant doesn't have the same meaning (it makes the variable
const, not what it points to).
Tom Lane [Tue, 16 Oct 2018 16:27:14 +0000 (12:27 -0400)]
Make PostgresNode.pm's poll_query_until() more chatty about failures.
Reporting only the stderr is unhelpful when the problem is that the
server output we're getting doesn't match what was expected. So we
should report the query output too; and just for good measure, let's
print the query we used and the output we expected.
Back-patch to 9.5 where poll_query_until was introduced.
Tom Lane [Tue, 16 Oct 2018 16:01:18 +0000 (12:01 -0400)]
Improve stability of recently-added regression test case.
Commit b5febc1d1 added a contrib/btree_gist test case that has been
observed to fail in the buildfarm as a result of background auto-analyze
updating stats and changing the selected plan. Forestall that by
forcibly analyzing in foreground, instead. The new plan choice is
just as good for our purposes, since we really only care that an
index-only plan does not get selected.
localtime.c's "struct state" is a rather large object, ~23KB. We were
statically allocating one for gmtsub() to use to represent the GMT
timezone, even though that function is not at all heavily used and is
never reached in most backends. Let's malloc it on-demand, instead.
This does pose the question of how to handle a malloc failure, but
there's already a well-defined error report convention here, ie
set errno and return NULL.
We have but one caller of pg_gmtime in HEAD, and two in back branches,
neither of which were troubling to check for error. Make them do so.
The possible errors are sufficiently unlikely (out-of-range timestamp,
and now malloc failure) that I think elog() is adequate.
Back-patch to all supported branches to keep our copies of the IANA
timezone code in sync. This particular change is in a stanza that
already differs from upstream, so it's a wash for maintenance purposes
--- but only as long as we keep the branches the same.
Andres Freund [Mon, 15 Oct 2018 22:24:33 +0000 (15:24 -0700)]
Move TupleTableSlots boolean member into one flag variable.
There's several reasons for this change:
1) It reduces the total size of TupleTableSlot / reduces alignment
padding, making the commonly accessed members fit into a single
cacheline (but we currently do not force proper alignment, so
that's not yet guaranteed to be helpful)
2) Combining the booleans into a flag allows to combine read/writes
from memory.
3) With the upcoming slot abstraction changes, it allows to have core
and extended flags, in a memory efficient way.
Author: Ashutosh Bapat and Andres Freund
Discussion: https://postgr.es/m/20180220224318.gw4oe5jadhpmcdnm@alap3.anarazel.de
Andres Freund [Sun, 14 Oct 2018 22:18:16 +0000 (15:18 -0700)]
Move generic slot support functions from heaptuple.c into execTuples.c.
heaptuple.c was never a particular good fit for slot_getattr(),
slot_getsomeattrs() and slot_getmissingattrs(), but in upcoming
changes slots will be made more abstract (allowing slots that contain
different types of tuples), making it clearly the wrong place.
Note that slot_deform_tuple() remains in it's current place, as it
clearly deals with a HeapTuple. getmissingattrs() also remains, but
it's less clear that that's correct - but execTuples.c wouldn't be the
right place.
Thomas Munro [Mon, 15 Oct 2018 22:04:41 +0000 (11:04 +1300)]
Move the replication lag tracker into heap memory.
Andres Freund complained about the 128KB of .bss occupied by LagTracker.
It's only needed in the walsender process, so allocate it in heap
memory there.
Author: Thomas Munro
Discussion: https://postgr.es/m/20181015200754.7y7zfuzsoux2c4ya%40alap3.anarazel.de
Tom Lane [Mon, 15 Oct 2018 18:01:38 +0000 (14:01 -0400)]
Check for stack overrun in standard_ProcessUtility().
ProcessUtility can recurse, and indeed can be driven to infinite
recursion, so it ought to have a check_stack_depth() call. This
covers the reported bug (portal trying to execute itself) and a bunch
of other cases that could perhaps arise somewhere.
Per bug #15428 from Malthe Borch. Back-patch to all supported branches.
When an error occurs during a benchmark run, exit with a nonzero exit
code and write a message at the end. Previously, it would just print
the error message when it happened but then proceed to print the run
summary normally and exit with status 0. To still allow
distinguishing setup from run-time errors, we use exit status 2 for
the new state, whereas existing errors during pgbench initialization
use exit status 1.
Peter Eisentraut [Mon, 15 Oct 2018 07:48:49 +0000 (09:48 +0200)]
Fixes for "Glyph not available" warnings from FOP
With the PostgreSQL 11 release notes acknowledgments list, FOP reported
WARNING: Glyph "?" (0x144, nacute) not available in font "Times-Roman".
WARNING: Glyph "?" (0x15e, Scedilla) not available in font "Times-Roman".
WARNING: Glyph "?" (0x15f, scedilla) not available in font "Times-Roman".
WARNING: Glyph "?" (0x131, dotlessi) not available in font "Times-Roman".
This is because we have some new contributors whose names use letters
that we haven't used before, and apparently FOP can't handle them out
of the box.
For now, just fix this by "unaccenting" those names. In the future,
maybe this can be fixed better with a different font configuration.
There is also another warning
WARNING: Glyph "?" (0x3c0, pi) not available in font "Times-Roman".
but that existed in previous releases and is not touched here.
This commit documents rounding of "length" parameter and absence of support
for unique indexes and NULLs searching. Backpatch to 9.6 where contrib/bloom
was introduced.
Discussion: https://postgr.es/m/CAF4Au4wPQQ7EHVSnzcLjsbY3oLSzVk6UemZLD1Sbmwysy3R61g%40mail.gmail.com
Author: Oleg Bartunov with minor editorialization by me
Backpatch-through: 9.6
Tom Lane [Sun, 14 Oct 2018 18:02:59 +0000 (14:02 -0400)]
Make some subquery-using test cases a bit more robust.
These test cases could be adversely affected by an upcoming change
to allow pullup of FROM-less subqueries. Tweak them to ensure that
they'll continue to test what they did before.
Tom Lane [Sun, 14 Oct 2018 17:07:29 +0000 (13:07 -0400)]
Use PlaceHolderVars within the quals of a FULL JOIN.
This prevents failures in cases where we pull up a constant or var-free
expression from a subquery and put it into a full join's qual. That can
result in not recognizing the qual as containing a mergejoin-able or
hashjoin-able condition. A PHV prevents the problem because it is still
recognized as belonging to the side of the join the subquery is in.
I'm not very sure about the net effect of this change on plan quality.
In "typical" cases where the join keys are Vars, nothing changes.
In an affected case, the PHV-wrapped expression is less likely to be seen
as equal to PHV-less instances below the join, but more likely to be seen
as equal to similar expressions above the join, so it may end up being a
wash. In the one existing case where there's any visible change in a
regression-test plan, it amounts to referencing a lower computation of a
COALESCE result instead of recomputing it, which seems like a win.
Given my uncertainty about that and the lack of field complaints,
no back-patch, even though this is a very ancient problem.
Tom Lane [Sun, 14 Oct 2018 16:11:16 +0000 (12:11 -0400)]
Clean up/tighten up coercibility checks in opr_sanity regression test.
With the removal of the old abstime type, there are no longer any cases
in this test where we need to use the weaker castcontext-ignoring form
of binary coercibility check. (The other major source of such headaches,
apparently-incompatible hash functions, is now hashvalidate()'s problem
not this test script's problem.) Hence, just use binary_coercible()
everywhere, and remove the comments explaining why we don't do so ---
which were broken anyway by cda6a8d01.
I left physically_coercible() in place but renamed it to better
match what it's actually testing, and added some comments.
Also, in test queries that have an assumption about the maximum number
of function arguments they need to handle, add a clause to make them fail
if someday there's a relevant function with more arguments. Otherwise
we're likely not to notice that we need to extend the queries.
Michael Paquier [Sun, 14 Oct 2018 13:23:21 +0000 (22:23 +0900)]
Avoid duplicate XIDs at recovery when building initial snapshot
On a primary, sets of XLOG_RUNNING_XACTS records are generated on a
periodic basis to allow recovery to build the initial state of
transactions for a hot standby. The set of transaction IDs is created
by scanning all the entries in ProcArray. However it happens that its
logic never counted on the fact that two-phase transactions finishing to
prepare can put ProcArray in a state where there are two entries with
the same transaction ID, one for the initial transaction which gets
cleared when prepare finishes, and a second, dummy, entry to track that
the transaction is still running after prepare finishes. This way
ensures a continuous presence of the transaction so as callers of for
example TransactionIdIsInProgress() are always able to see it as alive.
So, if a XLOG_RUNNING_XACTS takes a standby snapshot while a two-phase
transaction finishes to prepare, the record can finish with duplicated
XIDs, which is a state expected by design. If this record gets applied
on a standby to initial its recovery state, then it would simply fail,
so the odds of facing this failure are very low in practice. It would
be tempting to change the generation of XLOG_RUNNING_XACTS so as
duplicates are removed on the source, but this requires to hold on
ProcArrayLock for longer and this would impact all workloads,
particularly those using heavily two-phase transactions.
XLOG_RUNNING_XACTS is also actually used only to initialize the standby
state at recovery, so instead the solution is taken to discard
duplicates when applying the initial snapshot.
Diagnosed-by: Konstantin Knizhnik
Author: Michael Paquier
Discussion: https://postgr.es/m/0c96b653-4696-d4b4-6b5d-78143175d113@postgrespro.ru
Backpatch-through: 9.3
Tom Lane [Sat, 13 Oct 2018 20:31:09 +0000 (16:31 -0400)]
Make an editing pass over v11 release notes.
Set the release date. Do a bunch of copy-editing and markup improvement,
rearrange some stuff into what seemed a more sensible order, move some
things that did not seem to be in the right section.
Tom Lane [Fri, 12 Oct 2018 22:08:47 +0000 (18:08 -0400)]
Another round of portability hacking on ECPG regression tests.
Removing the separate Windows expected-files in commit f1885386f
turns out to have been too optimistic: on most (but not all!) of our
Windows buildfarm members, the tests still print floats with three
exponent digits, because they're invoking the native printf()
not snprintf.c.
But rather than put back the extra expected-files, let's hack
the three tests in question so that they adjust float formatting
the same way snprintf.c does.
Tom Lane [Fri, 12 Oct 2018 18:26:56 +0000 (14:26 -0400)]
Simplify use of AllocSetContextCreate() wrapper macro.
We can allow this macro to accept either abbreviated or non-abbreviated
allocation parameters by making use of __VA_ARGS__. As noted by Andres
Freund, it's unlikely that any compiler would have __builtin_constant_p
but not __VA_ARGS__, so this gives up little or no error checking, and
it avoids a minor but annoying API break for extensions.
With this change, there is no reason for anybody to call
AllocSetContextCreateExtended directly, so in HEAD I renamed it to
AllocSetContextCreateInternal. It's probably too late for an ABI
break like that in 11, though.
Alvaro Herrera [Fri, 12 Oct 2018 15:36:26 +0000 (12:36 -0300)]
Correct attach/detach logic for FKs in partitions
There was no code to handle foreign key constraints on partitioned
tables in the case of ALTER TABLE DETACH; and if you happened to ATTACH
a partition that already had an equivalent constraint, that one was
ignored and a new constraint was created. Adding this to the fact that
foreign key cloning reuses the constraint name on the partition instead
of generating a new name (as it probably should, to cater to SQL
standard rules about constraint naming within schemas), the result was a
pretty poor user experience -- the most visible failure was that just
detaching a partition and re-attaching it failed with an error such as
because it would try to create an identically-named constraint in the
partition. To make matters worse, if you tried to drop the constraint
in the now-independent partition, that would fail because the constraint
was still seen as dependent on the constraint in its former parent
partitioned table:
ERROR: cannot drop inherited constraint "test_result_asset_id_fkey" of relation "test_result_cbsystem_0001_0050_monthly_2018_09"
This fix attacks the problem from two angles: first, when the partition
is detached, the constraint is also marked as independent, so the drop
now works. Second, when the partition is re-attached, we scan existing
constraints searching for one matching the FK in the parent, and if one
exists, we link that one to the parent constraint. So we don't end up
with a duplicate -- and better yet, we don't need to scan the referenced
table to verify that the constraint holds.
To implement this I made a small change to previously planner-only
struct ForeignKeyCacheInfo to contain the constraint OID; also relcache
now maintains the list of FKs for partitioned tables too.
Backpatch to 11.
Reported-by: Michael Vitale (bug #15425)
Discussion: https://postgr.es/m/15425-2dbc9d2aa999f816@postgresql.org
Tom Lane [Fri, 12 Oct 2018 15:14:27 +0000 (11:14 -0400)]
Make float exponent output on Windows look the same as elsewhere.
Windows, alone among our supported platforms, likes to emit three-digit
exponent fields even when two digits would do. Adjust such results to
look like the way everyone else does it. Eliminate a bunch of variant
expected-output files that were needed only because of this quirk.
Michael Paquier [Fri, 12 Oct 2018 00:12:31 +0000 (09:12 +0900)]
Add TAP tests for pg_verify_checksums
All options available in the utility get coverage:
- Tests with disabled page checksums.
- Tests with enabled test checksums.
- Emulation of corruption and broken checksums with a full scan and
single relfilenode scan.
This patch has been contributed mainly by Michael Banck and Magnus
Hagander with things presented on various threads, and I have gathered
all the contents into a single patch.
Author: Michael Banck, Magnus Hagander, Michael Paquier Reviewed-by: Peter Eisentraut
Discussion: https://postgr.es/m/20181005012645.GE1629@paquier.xyz
Andres Freund [Fri, 28 Sep 2018 22:13:42 +0000 (15:13 -0700)]
Remove timetravel extension.
The extension depended on old types which are about to be removed. As
the code additionally was pretty crufty and didn't provide much in the
way of functionality, removing the extension seems to be the best way
forward. It's fairly trivial to write functionality in plpgsql that
more than covers what timetravel did.
Author: Andres Freund
Discussion:
https://postgr.es/m/20171213080506.cwjkpcz3bkk6yz2u@alap3.anarazel.de
https://postgr.es/m/25615.1513115237@sss.pgh.pa.us
Andres Freund [Thu, 11 Oct 2018 18:30:59 +0000 (11:30 -0700)]
Move timeofday() implementation out of nabstime.c.
nabstime.c is about to be removed, but timeofday() isn't related to
the rest of the functionality therein, and some find it useful. Move
to timestamp.c.
Andres Freund [Wed, 10 Oct 2018 20:53:02 +0000 (13:53 -0700)]
Fix logical decoding error when system table w/ toast is repeatedly rewritten.
Repeatedly rewriting a mapped catalog table with VACUUM FULL or
CLUSTER could cause logical decoding to fail with:
ERROR, "could not map filenode \"%s\" to relation OID"
To trigger the problem the rewritten catalog had to have live tuples
with toasted columns.
The problem was triggered as during catalog table rewrites the
heap_insert() check that prevents logical decoding information to be
emitted for system catalogs, failed to treat the new heap's toast table
as a system catalog (because the new heap is not recognized as a
catalog table via RelationIsLogicallyLogged()). The relmapper, in
contrast to the normal catalog contents, does not contain historical
information. After a single rewrite of a mapped table the new relation
is known to the relmapper, but if the table is rewritten twice before
logical decoding occurs, the relfilenode cannot be mapped to a
relation anymore. Which then leads us to error out. This only
happens for toast tables, because the main table contents aren't
re-inserted with heap_insert().
The fix is simple, add a new heap_insert() flag that prevents logical
decoding information from being emitted, and accept during decoding
that there might not be tuple data for toast tables.
Unfortunately that does not fix pre-existing logical decoding
errors. Doing so would require not throwing an error when a filenode
cannot be mapped to a relation during decoding, and that seems too
likely to hide bugs. If it's crucial to fix decoding for an existing
slot, temporarily changing the ERROR in ReorderBufferCommit() to a
WARNING appears to be the best fix.
Author: Andres Freund
Discussion: https://postgr.es/m/20180914021046.oi7dm4ra3ot2g2kt@alap3.anarazel.de
Backpatch: 9.4-, where logical decoding was introduced
Andres Freund [Wed, 10 Oct 2018 20:53:02 +0000 (13:53 -0700)]
Force synchronous commit to be enabled for all test_decoding tests.
Without that the tests fail when forced to be run against a cluster
with synchronous_commit = off (as the WAL might not yet be flushed to
disk by the point logical decoding gets called, and thus the expected
output breaks). Most tests already do that, add it to a few newer tests.
The previous check for a "complete query" omitted the new
PROCESS_UTILITY_QUERY_NONATOMIC value. This didn't actually make a
difference in practice, because only CALL and SET from PL/pgSQL run in
this state, but it's more correct to include it anyway.
It was previously a string setting that was converted into an enum by
custom code, but using the GUC enum facility seems much simpler and
doesn't change any functionality, except that
set transaction_isolation='default';
no longer works, but that was never documented and doesn't work with
any other transaction characteristics. (Note that this is not the
same as RESET or SET TO DEFAULT, which still work.)
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://www.postgresql.org/message-id/457db615-e84c-4838-310e-43841eb806e5@iki.fi
Tom Lane [Tue, 9 Oct 2018 17:36:16 +0000 (13:36 -0400)]
Make src/common/exec.c's error logging less ugly.
This code used elog where it really ought to use ereport, mainly so that
it can report a SQLSTATE different from ERRCODE_INTERNAL_ERROR. There
were some other random deviations from typical error report practice too.
In addition, we can make some cleanups that were impractical six months
ago:
* Use one variadic macro, instead of several with different numbers
of arguments, reducing the temptation to force-fit messages into
particular numbers of arguments;
* Use %m, even in the frontend case, simplifying the code.
Tom Lane [Tue, 9 Oct 2018 15:31:30 +0000 (11:31 -0400)]
Remove no-longer-needed variant expected regression result files.
numerology_1.out and float8-small-is-zero_1.out differ from their
base files only in showing plain zero rather than minus zero for
some results. I believe that in the wake of commit 6eb3eb577,
we will print minus zero as such on all IEEE-float platforms
(and non-IEEE floats are going to cause many more regression diffs
than this, anyway). Hence we should be able to remove these and
eliminate a bit of maintenance pain. Let's see if the buildfarm
agrees.
Tom Lane [Tue, 9 Oct 2018 15:10:07 +0000 (11:10 -0400)]
Select appropriate PG_PRINTF_ATTRIBUTE for recent NetBSD.
NetBSD-current generates a large number of warnings about "%m" not
being appropriate to use with *printf functions. While that's true
for their native printf, it's surely not true for snprintf.c, so I
think they have misunderstood gcc's definition of the "gnu_printf"
archetype. Nonetheless, choosing "__syslog__" instead silences the
warnings; so teach configure about that.
Since this is only a cosmetic warning issue (and anyway it depends
on previous hacking to be self-consistent), no back-patch.
Michael Paquier [Tue, 9 Oct 2018 13:29:09 +0000 (22:29 +0900)]
Add pg_ls_archive_statusdir function
This function lists the contents of the WAL archive status directory,
and is intended to be used by monitoring tools. Unlike pg_ls_dir(),
access to it can be granted to non-superusers so that those monitoring
tools can observe the principle of least privilege. Access is also
given by default to members of pg_monitor.
Author: Christoph Moench-Tegeder Reviewed-by: Aya Iwata
Discussion: https://postgr.es/m/20180930205920.GA64534@elch.exwg.net
Tom Lane [Tue, 9 Oct 2018 04:04:27 +0000 (00:04 -0400)]
Convert some long lists in configure.in to one-line-per-entry style.
The idea here is that patches that add items to these lists will often
be easier to rebase over other additions to the same lists, because
they won't be trying to touch the very same line of configure.in.
There will still be merge conflicts in the configure script, but that
can be fixed just by re-running autoconf (or by leaving configure out
of the submitted patch to begin with ...)
Implementation note: use of m4_normalize() is necessary to get rid of
the newlines, else incorrect shell syntax will be emitted. But with
that hack, the generated configure script is identical to what it
was before.
Thomas Munro [Mon, 8 Oct 2018 23:51:01 +0000 (12:51 +1300)]
Relax transactional restrictions on ALTER TYPE ... ADD VALUE (redux).
Originally committed as 15bc038f (plus some follow-ups), this was
reverted in 28e07270 due to a problem discovered in parallel
workers. This new version corrects that problem by sending the
list of uncommitted enum values to parallel workers.
Here follows the original commit message describing the change:
To prevent possibly breaking indexes on enum columns, we must keep
uncommitted enum values from getting stored in tables, unless we
can be sure that any such column is new in the current transaction.
Formerly, we enforced this by disallowing ALTER TYPE ... ADD VALUE
from being executed at all in a transaction block, unless the target
enum type had been created in the current transaction. This patch
removes that restriction, and instead insists that an uncommitted enum
value can't be referenced unless it belongs to an enum type created
in the same transaction as the value. Per discussion, this should be
a bit less onerous. It does require each function that could possibly
return a new enum value to SQL operations to check this restriction,
but there aren't so many of those that this seems unmaintainable.
Author: Andrew Dunstan and Tom Lane, with parallel query fix by Thomas Munro Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/CAEepm%3D0Ei7g6PaNTbcmAh9tCRahQrk%3Dr5ZWLD-jr7hXweYX3yg%40mail.gmail.com
Discussion: https://postgr.es/m/4075.1459088427%40sss.pgh.pa.us
Tom Lane [Mon, 8 Oct 2018 23:15:55 +0000 (19:15 -0400)]
Fix omissions in snprintf.c's coverage of standard *printf functions.
A warning on a NetBSD box revealed to me that pg_waldump/compat.c
is using vprintf(), which snprintf.c did not provide coverage for.
This is not good if we want to have uniform *printf behavior, and
it's pretty silly to omit when it's a one-line function.
I also noted that snprintf.c has pg_vsprintf() but for some reason
it was not exposed to the outside world, creating another way in
which code might accidentally invoke the platform *printf family.
Let's just make sure that we replace all eight of the POSIX-standard
printf family.
Also, upgrade plperl.h and plpython.h to make sure that they do
their undefine/redefine rain dance for all eight, not some random
maybe-sufficient subset thereof.
Tom Lane [Mon, 8 Oct 2018 16:19:20 +0000 (12:19 -0400)]
Improve snprintf.c's handling of NaN, Infinity, and minus zero.
Up to now, float4out/float8out handled NaN and Infinity cases explicitly,
and invoked psprintf only for ordinary float values. This was done because
platform implementations of snprintf produce varying representations of
these special cases. But now that we use snprintf.c always, it's better
to give it the responsibility to produce a uniform representation of
these cases, so that we have uniformity across the board not only in
float4out/float8out. Hence, move that work into fmtfloat().
Also, teach fmtfloat() to recognize IEEE minus zero and handle it
correctly. The previous coding worked only accidentally, and would
fail for e.g. "%+f" format (it'd print "+-0.00000"). Now that we're
using snprintf.c everywhere, it's not acceptable for it to do weird
things in corner cases. (This incidentally avoids a portability
problem we've seen on some really ancient platforms, that native
sprintf does the wrong thing with minus zero.)
Also, introduce a new entry point in snprintf.c to allow float[48]out
to bypass the work of interpreting a well-known format spec, as well
as bypassing the overhead of the psprintf layer. I modeled this API
loosely on strfromd(). In my testing, this brings float[48]out back
to approximately the same speed they had when using native snprintf,
fixing one of the main performance issues caused by using snprintf.c.
(There is some talk of more aggressive work to improve the speed of
floating-point output conversion, but these changes seem to provide
a better starting point for such work anyway.)
Getting rid of the previous ad-hoc hack for Infinity/NaN in fmtfloat()
allows removing <ctype.h> from snprintf.c's #includes. I also removed
a few other #includes that I think are historical, though the buildfarm
may expose that as wrong.
Tom Lane [Mon, 8 Oct 2018 14:41:34 +0000 (10:41 -0400)]
Avoid O(N^2) cost in ExecFindRowMark().
If there are many ExecRowMark structs, we spent O(N^2) time in
ExecFindRowMark during executor startup. Once upon a time this was
not of great concern, but the addition of native partitioning has
squeezed out enough other costs that this can become the dominant
overhead in some use-cases for tables with many partitions.
To fix, simply replace that List data structure with an array.
This adds a little bit of cost to execCurrentOf(), but not much,
and anyway that code path is neither of large importance nor very
efficient now. If we ever decide it is a bottleneck, constructing a
hash table for lookup-by-tableoid would likely be the thing to do.
Per complaint from Amit Langote, though this is different from
his fix proposal.
Alvaro Herrera [Mon, 8 Oct 2018 13:37:21 +0000 (10:37 -0300)]
Silence compiler warning in Assert()
gcc 6.3 does not whine about this mistake I made in 39808e8868c8 but
evidently lots of other compilers do, according to Michael Paquier,
Peter Eisentraut, Arthur Zakirov, Tomas Vondra.
Michael Paquier [Mon, 8 Oct 2018 08:56:13 +0000 (17:56 +0900)]
Improve two error messages related to foreign keys on partitioned tables
Error messages for creating a foreign key on a partitioned table using
ONLY or NOT VALID were wrong in mentioning the objects they worked on.
This commit adds on the way some regression tests missing for those
cases.
Tom Lane [Sun, 7 Oct 2018 18:33:17 +0000 (14:33 -0400)]
Remove some unnecessary fields from Plan trees.
In the wake of commit f2343653f, we no longer need some fields that
were used before to control executor lock acquisitions:
* PlannedStmt.nonleafResultRelations can go away entirely.
* partitioned_rels can go away from Append, MergeAppend, and ModifyTable.
However, ModifyTable still needs to know the RT index of the partition
root table if any, which was formerly kept in the first entry of that
list. Add a new field "rootRelation" to remember that. rootRelation is
partly redundant with nominalRelation, in that if it's set it will have
the same value as nominalRelation. However, the latter field has a
different purpose so it seems best to keep them distinct.
Amit Langote, reviewed by David Rowley and Jesper Pedersen,
and whacked around a bit more by me
Alvaro Herrera [Sun, 7 Oct 2018 01:13:19 +0000 (22:13 -0300)]
Fix catalog insertion order for ATTACH PARTITION
Commit 2fbdf1b38bc changed the order in which we inserted catalog rows
when creating partitions, so that we could remove an unsightly hack
required for untimely relcache invalidations. However, that commit only
changed the ordering for CREATE TABLE PARTITION OF, and left ALTER TABLE
ATTACH PARTITION unchanged, so the latter can be affected when catalog
invalidations occur, for instance when the partition key involves an SQL
function.