Andres Freund [Sun, 27 Mar 2016 15:47:46 +0000 (17:47 +0200)]
Change various Gin*Is* macros to return 0/1.
Returning the direct result of bit arithmetic, in a macro intended to be
used in a boolean manner, can be problematic if the return value is
stored in a variable of type 'bool'. If bool is implemented using C99's
_Bool, that can lead to comparison failures if the variable is then
compared again with the expression (see ginStepRight() for an example
that fails), as _Bool forces the result to be 0/1. That happens in some
configurations of newer MSVC compilers. It's also problematic when
storing the result of such an expression in a narrower type.
Several gin macros have been declared in that style since gin's initial
commit in 8a3631f8d86.
There's a lot more macros like this, but this is the only one causing
regression test failures; and I don't want to commit and backpatch a
larger patch with lots of conflicts just before the next set of minor
releases.
Discussion: 20150811154237.GD17575@awork2.anarazel.de
Backpatch: All supported branches
Tom Lane [Sat, 26 Mar 2016 19:58:44 +0000 (15:58 -0400)]
Modernize zic's test for valid timezone abbreviations.
We really need to sync all of our IANA-derived timezone code with upstream,
but that's going to be a large patch and I certainly don't care to shove
such a thing into stable branches immediately before a release. As a
stopgap, copy just the tzcode2016c logic that checks validity of timezone
abbreviations. This prevents getting multiple "time zone abbreviation
differs from POSIX standard" bleats with tzdata 2014b and later.
Tom Lane [Fri, 25 Mar 2016 23:03:08 +0000 (19:03 -0400)]
Update time zone data files to tzdata release 2016c.
DST law changes in Azerbaijan, Chile, Haiti, Palestine, and Russia (Altai,
Astrakhan, Kirov, Sakhalin, Ulyanovsk regions). Historical corrections
for Lithuania, Moldova, Russia (Kaliningrad, Samara, Volgograd).
As of 2015b, the keepers of the IANA timezone database started to use
numeric time zone abbreviations (e.g., "+04") instead of inventing
abbreviations not found in the wild like "ASTT". This causes our rather
old copy of zic to whine "warning: time zone abbreviation differs from
POSIX standard" several times during "make install". This warning is
harmless according to the IANA folk, and I don't see any problems with
these abbreviations in some simple tests; but it seems like now would be
a good time to update our copy of the tzcode stuff. I'll look into that
soon.
Andrew Dunstan [Sat, 19 Mar 2016 22:59:41 +0000 (18:59 -0400)]
Remove dependency on psed for MSVC builds.
Modern Perl has removed psed from its core distribution, so it might not
be readily available on some build platforms. We therefore replace its
use with a Perl script generated by s2p, which is equivalent to the sed
script. The latter is retained for non-MSVC builds to avoid creating a
new hard dependency on Perl for non-Windows tarball builds.
Tom Lane [Thu, 17 Mar 2016 03:18:08 +0000 (23:18 -0400)]
Fix "pg_bench -C -M prepared".
This didn't work because when we dropped and re-established a database
connection, we did not bother to reset session-specific state such as
the statements-are-prepared flags.
The st->prepared[] array certainly needs to be flushed, and I cleared a
couple of other fields as well that couldn't possibly retain meaningful
state for a new connection.
In passing, fix some bogus comments and strange field order choices.
Tom Lane [Tue, 15 Mar 2016 17:19:58 +0000 (13:19 -0400)]
Cope if platform declares mbstowcs_l(), but not locale_t, in <xlocale.h>.
Previously, we included <xlocale.h> only if necessary to get the definition
of type locale_t. According to notes in PGAC_TYPE_LOCALE_T, this is
important because on some versions of glibc that file supplies an
incompatible declaration of locale_t. (This info may be obsolete, because
on my RHEL6 box that seems to be the *only* definition of locale_t; but
there may still be glibc's in the wild for which it's a live concern.)
It turns out though that on FreeBSD and maybe other BSDen, you can get
locale_t from stdlib.h or locale.h but mbstowcs_l() and friends only from
<xlocale.h>. This was leaving us compiling calls to mbstowcs_l() and
friends with no visible prototype, which causes a warning and could
possibly cause actual trouble, since it's not declared to return int.
Hence, adjust the configure checks so that we'll include <xlocale.h>
either if it's necessary to get type locale_t or if it's necessary to
get a declaration of mbstowcs_l().
Report and patch by Aleksander Alekseev, somewhat whacked around by me.
Back-patch to all supported branches, since we have been using
mbstowcs_l() since 9.1.
Tom Lane [Mon, 14 Mar 2016 15:31:22 +0000 (11:31 -0400)]
Add missing NULL terminator to list_SECURITY_LABEL_preposition[].
On the machines I tried this on, pressing TAB after SECURITY LABEL led to
being offered ON and FOR as intended, plus random other keywords (varying
across machines). But if you were a bit more unlucky you'd get a crash,
as reported by nummervet@mail.ru in bug #14019.
Seems to have been an aboriginal error in the SECURITY LABEL patch,
commit 4d355a8336e0f226. Hence, back-patch to all supported versions.
There's no bug in HEAD, though, thanks to our recent tab-completion
rewrite.
Magnus Hagander [Thu, 10 Mar 2016 12:48:58 +0000 (13:48 +0100)]
Avoid crash on old Windows with AVX2-capable CPU for VS2013 builds
The Visual Studio 2013 CRT generates invalid code when it makes a 64-bit
build that is later used on a CPU that supports AVX2 instructions using a
version of Windows before 7SP1/2008R2SP1.
Detect this combination, and in those cases turn off the generation of
FMA3, per recommendation from the Visual Studio team.
The bug is actually in the CRT shipping with Visual Studio 2013, but
Microsoft have stated they're only fixing it in newer major versions.
The fix is therefor conditioned specifically on being built with this
version of Visual Studio, and not previous or later versions.
Andres Freund [Thu, 10 Mar 2016 02:53:54 +0000 (18:53 -0800)]
Avoid unlikely data-loss scenarios due to rename() without fsync.
Renaming a file using rename(2) is not guaranteed to be durable in face
of crashes. Use the previously added durable_rename()/durable_link_or_rename()
in various places where we previously just renamed files.
Most of the changed call sites are arguably not critical, but it seems
better to err on the side of too much durability. The most prominent
known case where the previously missing fsyncs could cause data loss is
crashes at the end of a checkpoint. After the actual checkpoint has been
performed, old WAL files are recycled. When they're filled, their
contents are fdatasynced, but we did not fsync the containing
directory. An OS/hardware crash in an unfortunate moment could then end
up leaving that file with its old name, but new content; WAL replay
would thus not replay it.
Reported-By: Tomas Vondra
Author: Michael Paquier, Tomas Vondra, Andres Freund
Discussion: 56583BDD.9060302@2ndquadrant.com
Backpatch: All supported branches
Andres Freund [Thu, 10 Mar 2016 02:53:54 +0000 (18:53 -0800)]
Introduce durable_rename() and durable_link_or_rename().
Renaming a file using rename(2) is not guaranteed to be durable in face
of crashes; especially on filesystems like xfs and ext4 when mounted
with data=writeback. To be certain that a rename() atomically replaces
the previous file contents in the face of crashes and different
filesystems, one has to fsync the old filename, rename the file, fsync
the new filename, fsync the containing directory. This sequence is not
generally adhered to currently; which exposes us to data loss risks. To
avoid having to repeat this arduous sequence, introduce
durable_rename(), which wraps all that.
Also add durable_link_or_rename(). Several places use link() (with a
fallback to rename()) to rename a file, trying to avoid replacing the
target file out of paranoia. Some of those rename sequences need to be
durable as well. There seems little reason extend several copies of the
same logic, so centralize the link() callers.
This commit does not yet make use of the new functions; they're used in
a followup commit.
Author: Michael Paquier, Andres Freund
Discussion: 56583BDD.9060302@2ndquadrant.com
Backpatch: All supported branches
Tom Lane [Wed, 9 Mar 2016 19:51:02 +0000 (14:51 -0500)]
Fix incorrect handling of NULL index entries in indexed ROW() comparisons.
An index search using a row comparison such as ROW(a, b) > ROW('x', 'y')
would stop upon reaching a NULL entry in the "b" column, ignoring the
fact that there might be non-NULL "b" values associated with later values
of "a". This happens because _bt_mark_scankey_required() marks the
subsidiary scankey for "b" as required, which is just wrong: it's for
a column after the one with the first inequality key (namely "a"), and
thus can't be considered a required match.
This bit of brain fade dates back to the very beginnings of our support
for indexed ROW() comparisons, in 2006. Kind of astonishing that no one
came across it before Glen Takahashi, in bug #14010.
Back-patch to all supported versions.
Note: the given test case doesn't actually fail in unpatched 9.1, evidently
because the fix for bug #6278 (i.e., stopping at nulls in either scan
direction) is required to make it fail. I'm sure I could devise a case
that fails in 9.1 as well, perhaps with something involving making a cursor
back up; but it doesn't seem worth the trouble.
Andres Freund [Tue, 8 Mar 2016 22:59:29 +0000 (14:59 -0800)]
ltree: Zero padding bytes when allocating memory for externally visible data.
ltree/ltree_gist/ltxtquery's headers stores data at MAXALIGN alignment,
requiring some padding bytes. So far we left these uninitialized. Zero
those by using palloc0.
Author: Andres Freund Reported-By: Andres Freund / valgrind / buildarm animal skink
Backpatch: 9.1-
Andres Freund [Tue, 8 Mar 2016 21:33:24 +0000 (13:33 -0800)]
plperl: Correctly handle empty arrays in plperl_ref_from_pg_array.
plperl_ref_from_pg_array() didn't consider the case that postgrs arrays
can have 0 dimensions (when they're empty) and accessed the first
dimension without a check. Fix that by special casing the empty array
case.
Author: Alex Hunsaker Reported-By: Andres Freund / valgrind / buildfarm animal skink
Discussion: 20160308063240.usnzg6bsbjrne667@alap3.anarazel.de
Backpatch: 9.1-
Tom Lane [Mon, 7 Mar 2016 15:40:44 +0000 (10:40 -0500)]
Fix backwards test for Windows service-ness in pg_ctl.
A thinko in a96761391 caused pg_ctl to get it exactly backwards when
deciding whether to report problems to the Windows eventlog or to stderr.
Per bug #14001 from Manuel Mathar, who also identified the fix.
Like the previous patch, back-patch to all supported branches.
Tom Lane [Mon, 7 Mar 2016 00:21:03 +0000 (19:21 -0500)]
Fix not-terribly-safe coding in NIImportOOAffixes() and NIImportAffixes().
There were two places in spell.c that supposed that they could search
for a location in a string produced by lowerstr() and then transpose
the offset into the original string. But this fails completely if
lowerstr() transforms any characters into characters of different byte
length, as can happen in Turkish UTF8 for instance.
We'd added some comments about this coding in commit 51e78ab4ff328296,
but failed to realize that it was not merely confusing but wrong.
Coverity complained about this code years ago, but in such an opaque
fashion that nobody understood what it was on about. I'm not entirely
sure that this issue *is* what it's on about, actually, but perhaps
this patch will shut it up -- and in any case the problem is clear.
Robert Haas [Fri, 4 Mar 2016 16:53:20 +0000 (11:53 -0500)]
Fix query-based tab completion for multibyte characters.
The existing code confuses the byte length of the string (which is
relevant when passing it to pg_strncasecmp) with the character length
of the string (which is relevant when it is used with the SQL substring
function). Separate those two concepts.
Report and patch by Kyotaro Horiguchi, reviewed by Thomas Munro and
reviewed and further revised by me.
Tom Lane [Tue, 1 Mar 2016 00:11:38 +0000 (19:11 -0500)]
Improve error message for rejecting RETURNING clauses with dropped columns.
This error message was written with only ON SELECT rules in mind, but since
then we also made RETURNING-clause targetlists go through the same logic.
This means that you got a rather off-topic error message if you tried to
add a rule with RETURNING to a table having dropped columns. Ideally we'd
just support that, but some preliminary investigation says that it might be
a significant amount of work. Seeing that Nicklas Avén's complaint is the
first one we've gotten about this in the ten years or so that the code's
been like that, I'm unwilling to put much time into it. Instead, improve
the error report by issuing a different message for RETURNING cases, and
revise the associated comment based on this investigation.
Tom Lane [Mon, 29 Feb 2016 04:39:20 +0000 (23:39 -0500)]
Avoid multiple free_struct_lconv() calls on same data.
A failure partway through PGLC_localeconv() led to a situation where
the next call would call free_struct_lconv() a second time, leading
to free() on already-freed strings, typically leading to a core dump.
Add a flag to remember whether we need to do that.
Per report from Thom Brown. His example case only provokes the failure
as far back as 9.4, but nonetheless this code is obviously broken, so
back-patch to all supported branches.
Simon Riggs [Fri, 19 Feb 2016 08:35:02 +0000 (08:35 +0000)]
Correct StartupSUBTRANS for page wraparound
StartupSUBTRANS() incorrectly handled cases near the max pageid in the subtrans
data structure, which in some cases could lead to errors in startup for Hot
Standby.
This patch wraps the pageids correctly, avoiding any such errors.
Identified by exhaustive crash testing by Jeff Janes.
Tom Lane [Thu, 18 Feb 2016 20:40:36 +0000 (15:40 -0500)]
Fix multiple bugs in contrib/pgstattuple's pgstatindex() function.
Dead or half-dead index leaf pages were incorrectly reported as live, as a
consequence of a code rearrangement I made (during a moment of severe brain
fade, evidently) in commit d287818eb514d431.
The index metapage was not counted in index_size, causing that result to
not agree with the actual index size on-disk.
Index root pages were not counted in internal_pages, which is inconsistent
compared to the case of a root that's also a leaf (one-page index), where
the root would be counted in leaf_pages. Aside from that inconsistency,
this could lead to additional transient discrepancies between the reported
page counts and index_size, since it's possible for pgstatindex's scan to
see zero or multiple pages marked as BTP_ROOT, if the root moves due to
a split during the scan. With these fixes, index_size will always be
exactly one page more than the sum of the displayed page counts.
Also, the index_size result was incorrectly documented as being measured in
pages; it's always been measured in bytes. (While fixing that, I couldn't
resist doing some small additional wordsmithing on the pgstattuple docs.)
Including the metapage causes the reported index_size to not be zero for
an empty index. To preserve the desired property that the pgstattuple
regression test results are platform-independent (ie, BLCKSZ configuration
independent), scale the index_size result in the regression tests.
The documentation issue was reported by Otsuka Kenji, and the inconsistent
root page counting by Peter Geoghegan; the other problems noted by me.
Back-patch to all supported branches, because this has been broken for
a long time.
Tom Lane [Wed, 17 Feb 2016 02:08:15 +0000 (21:08 -0500)]
Make plpython cope with funny characters in function names.
A function name that's double-quoted in SQL can contain almost any
characters, but we were using that name directly as part of the name
generated for the Python-level function, and Python doesn't like
anything that isn't pretty much a standard identifier. To fix,
replace anything that isn't an ASCII letter or digit with an underscore
in the generated name. This doesn't create any risk of duplicate Python
function names because we were already appending the function OID to
the generated name to ensure uniqueness. Per bug #13960 from Jim Nasby.
Patch by Jim Nasby, modified a bit by me. Back-patch to all
supported branches.
Tom Lane [Tue, 16 Feb 2016 18:43:03 +0000 (13:43 -0500)]
Improve documentation about CREATE INDEX CONCURRENTLY.
Clarify the description of which transactions will block a CREATE INDEX
CONCURRENTLY command from proceeding, and mention that the index might
still not be usable after CREATE INDEX completes. (This happens if the
index build detected broken HOT chains, so that pg_index.indcheckxmin gets
set, and there are open old transactions preventing the xmin horizon from
advancing past the index's initial creation. I didn't want to explain what
broken HOT chains are, though, so I omitted an explanation of exactly when
old transactions prevent the index from being used.)
Per discussion with Chris Travers. Back-patch to all supported branches,
since the same text appears in all of them.
Alvaro Herrera [Mon, 15 Feb 2016 23:33:43 +0000 (20:33 -0300)]
pgbench: avoid FD_ISSET on an invalid file descriptor
The original code wasn't careful to test the file descriptor returned by
PQsocket() for an invalid socket. If an invalid socket did turn up,
that would amount to calling FD_ISSET with fd = -1, whereby undefined
behavior can be invoked.
To fix, test file descriptor for validity and stop further processing if
that fails.
Problem noticed by Coverity.
There is an existing FD_ISSET callsite that does check for invalid
sockets beforehand, but the error message reported by it was
strerror(errno); in testing the aforementioned change, that turns out to
result in "bad socket: Success" which isn't terribly helpful. Instead
use PQerrorMessage() in both places which is more likely to contain an
useful error message.
Tom Lane [Mon, 15 Feb 2016 22:11:52 +0000 (17:11 -0500)]
Suppress compiler warnings about useless comparison of unsigned to zero.
Reportedly, some compilers warn about tests like "c < 0" if c is unsigned,
and hence complain about the character range checks I added in commit 3bb3f42f3749d40b8d4de65871e8d828b18d4a45. This is a bit of a pain since
the regex library doesn't really want to assume that chr is unsigned.
However, since any such reconfiguration would involve manual edits of
regcustom.h anyway, we can put it on the shoulders of whoever wants to
do that to adjust this new range-checking macro correctly.
Noah Misch [Thu, 11 Feb 2016 01:34:02 +0000 (20:34 -0500)]
Accept pg_ctl timeout from the PGCTLTIMEOUT environment variable.
Many automated test suites call pg_ctl. Buildfarm members axolotl,
hornet, mandrill, shearwater, sungazer and tern have failed when server
shutdown took longer than the pg_ctl default 60s timeout. This addition
permits slow hosts to easily raise the timeout without us editing a
--timeout argument into every test suite pg_ctl call. Back-patch to 9.1
(all supported versions) for the sake of automated testing.
Tom Lane [Thu, 11 Feb 2016 00:30:12 +0000 (19:30 -0500)]
Avoid use of sscanf() to parse ispell dictionary files.
It turns out that on FreeBSD-derived platforms (including OS X), the
*scanf() family of functions is pretty much brain-dead about multibyte
characters. In particular it will apply isspace() to individual bytes
of input even when those bytes are part of a multibyte character, thus
allowing false recognition of a field-terminating space.
We appear to have little alternative other than instituting a coding
rule that *scanf() is not to be used if the input string might contain
multibyte characters. (There was some discussion of relying on "%ls",
but that probably just moves the portability problem somewhere else,
and besides it doesn't fully prevent BSD *scanf() from using isspace().)
This patch is a down payment on that: it gets rid of use of sscanf()
to parse ispell dictionary files, which are certainly at great risk
of having a problem. The code is cleaner this way anyway, though
a bit longer.
In passing, improve a few comments.
Report and patch by Artur Zakirov, reviewed and somewhat tweaked by me.
Back-patch to all supported branches.
Tom Lane [Mon, 8 Feb 2016 15:25:40 +0000 (10:25 -0500)]
Fix some regex issues with out-of-range characters and large char ranges.
Previously, our regex code defined CHR_MAX as 0xfffffffe, which is a
bad choice because it is outside the range of type "celt" (int32).
Characters approaching that limit could lead to infinite loops in logic
such as "for (c = a; c <= b; c++)" where c is of type celt but the
range bounds are chr. Such loops will work safely only if CHR_MAX+1
is representable in celt, since c must advance to beyond b before the
loop will exit.
Fortunately, there seems no reason not to restrict CHR_MAX to 0x7ffffffe.
It's highly unlikely that Unicode will ever assign codes that high, and
none of our other backend encodings need characters beyond that either.
In addition to modifying the macro, we have to explicitly enforce character
range restrictions on the values of \u, \U, and \x escape sequences, else
the limit is trivially bypassed.
Also, the code for expanding case-independent character ranges in bracket
expressions had a potential integer overflow in its calculation of the
number of characters it could generate, which could lead to allocating too
small a character vector and then overwriting memory. An attacker with the
ability to supply arbitrary regex patterns could easily cause transient DOS
via server crashes, and the possibility for privilege escalation has not
been ruled out.
Quite aside from the integer-overflow problem, the range expansion code was
unnecessarily inefficient in that it always produced a result consisting of
individual characters, abandoning the knowledge that we had a range to
start with. If the input range is large, this requires excessive memory.
Change it so that the original range is reported as-is, and then we add on
any case-equivalent characters that are outside that range. With this
approach, we can bound the number of individual characters allowed without
sacrificing much. This patch allows at most 100000 individual characters,
which I believe to be more than the number of case pairs existing in
Unicode, so that the restriction will never be hit in practice.
It's still possible for range() to take awhile given a large character code
range, so also add statement-cancel detection to its loop. The downstream
function dovec() also lacked cancel detection, and could take a long time
given a large output from range().
Per fuzz testing by Greg Stark. Back-patch to all supported branches.
Tom Lane [Sun, 7 Feb 2016 21:02:44 +0000 (16:02 -0500)]
Improve documentation about PRIMARY KEY constraints.
Get rid of the false implication that PRIMARY KEY is exactly equivalent to
UNIQUE + NOT NULL. That was more-or-less true at one time in our
implementation, but the standard doesn't say that, and we've grown various
features (many of them required by spec) that treat a pkey differently from
less-formal constraints. Per recent discussion on pgsql-general.
I failed to resist the temptation to do some other wordsmithing in the
same area.
Noah Misch [Sat, 6 Feb 2016 01:22:51 +0000 (20:22 -0500)]
Force certain "pljava" custom GUCs to be PGC_SUSET.
Future PL/Java versions will close CVE-2016-0766 by making these GUCs
PGC_SUSET. This PostgreSQL change independently mitigates that PL/Java
vulnerability, helping sites that update PostgreSQL more frequently than
PL/Java. Back-patch to 9.1 (all supported versions).
Tom Lane [Thu, 4 Feb 2016 05:26:10 +0000 (00:26 -0500)]
In pg_dump, ensure that view triggers are processed after view rules.
If a view is split into CREATE TABLE + CREATE RULE to break a circular
dependency, then any triggers on the view must be dumped/reloaded after
the CREATE RULE; else the backend may reject the CREATE TRIGGER because
it's the wrong type of trigger for a plain table. This works all right
in plain dump/restore because of pg_dump's sorting heuristic that places
triggers after rules. However, when using parallel restore, the ordering
must be enforced by a dependency --- and we didn't have one.
Fixing this is a mere matter of adding an addObjectDependency() call,
except that we need to be able to find all the triggers belonging to the
view relation, and there was no easy way to do that. Add fields to
pg_dump's TableInfo struct to remember where the associated TriggerInfo
struct(s) are.
Per bug report from Dennis Kögel. The failure can be exhibited at least
as far back as 9.1, so back-patch to all supported branches.
Michael Meskes [Mon, 1 Feb 2016 12:10:40 +0000 (13:10 +0100)]
Make sure ecpg header files do not have a comment lasting several lines, one of
which is a preprocessor directive. This leads ecpg to incorrectly parse the comment as nested.
Tom Lane [Fri, 29 Jan 2016 09:28:03 +0000 (10:28 +0100)]
Fix incorrect pattern-match processing in psql's \det command.
listForeignTables' invocation of processSQLNamePattern did not match up
with the other ones that handle potentially-schema-qualified names; it
failed to make use of pg_table_is_visible() and also passed the name
arguments in the wrong order. Bug seems to have been aboriginal in commit 0d692a0dc9f0e532. It accidentally sort of worked as long as you didn't
inquire too closely into the behavior, although the silliness was later
exposed by inconsistencies in the test queries added by 59efda3e50ca4de6
(which I probably should have questioned at the time, but didn't).
Per bug #13899 from Reece Hart. Patch by Reece Hart and Tom Lane.
Back-patch to all affected branches.
Tom Lane [Tue, 26 Jan 2016 20:38:33 +0000 (15:38 -0500)]
Fix startup so that log prefix %h works for the log_connections message.
We entirely randomly chose to initialize port->remote_host just after
printing the log_connections message, when we could perfectly well do it
just before, allowing %h and %r to work for that message. Per gripe from
Artem Tomyuk.
Magnus Hagander [Thu, 14 Jan 2016 12:06:03 +0000 (13:06 +0100)]
Properly close token in sspi authentication
We can never leak more than one token, but we shouldn't do that. We
don't bother closing it in the error paths since the process will
exit shortly anyway.
Tom Lane [Wed, 13 Jan 2016 23:55:27 +0000 (18:55 -0500)]
Handle extension members when first setting object dump flags in pg_dump.
pg_dump's original approach to handling extension member objects was to
run around and clear (or set) their dump flags rather late in its data
collection process. Unfortunately, quite a lot of code expects those flags
to be valid before that; which was an entirely reasonable expectation
before we added extensions. In particular, this explains Karsten Hilbert's
recent report of pg_upgrade failing on a database in which an extension
has been installed into the pg_catalog schema. Its objects are initially
marked as not-to-be-dumped on the strength of their schema, and later we
change them to must-dump because we're doing a binary upgrade of their
extension; but we've already skipped essential tasks like making associated
DO_SHELL_TYPE objects.
To fix, collect extension membership data first, and incorporate it in the
initial setting of the dump flags, so that those are once again correct
from the get-go. This has the undesirable side effect of slightly
lengthening the time taken before pg_dump acquires table locks, but testing
suggests that the increase in that window is not very much.
Along the way, get rid of ugly special-case logic for deciding whether
to dump procedural languages, FDWs, and foreign servers; dump decisions
for those are now correct up-front, too.
In 9.3 and up, this also fixes erroneous logic about when to dump event
triggers (basically, they were *always* dumped before). In 9.5 and up,
transform objects had that problem too.
Since this problem came in with extensions, back-patch to all supported
versions.
The first three of these take OID, so a null argument will normally look
like a zero to them, resulting in "ERROR: could not open relation with OID
0" for brin_summarize_new_values, and no action for the pg_stat_reset_XXX
functions. The other three will dump core on a null argument, though this
is mitigated by the fact that they won't do so until after checking that
the caller is superuser or has rolreplication privilege.
In addition, the pg_logical_slot_get/peek[_binary]_changes family was
intentionally marked nonstrict, but failed to make nullness checks on all
the arguments; so again a null-pointer-dereference crash is possible but
only for superusers and rolreplication users.
Add the missing ARGISNULL checks to the latter functions, and mark the
former functions as strict in pg_proc. Make that change in the back
branches too, even though we can't force initdb there, just so that
installations initdb'd in future won't have the issue. Since none of these
bugs rise to the level of security issues (and indeed the pg_stat_reset_XXX
functions hardly misbehave at all), it seems sufficient to do this.
In addition, fix some order-of-operations oddities in the slot_get_changes
family, mostly cosmetic, but not the part that moves the function's last
few operations into the PG_TRY block. As it stood, there was significant
risk for an error to exit without clearing historical information from
the system caches.
The slot_get_changes bugs go back to 9.4 where that code was introduced.
Back-patch appropriate subsets of the pg_proc changes into all active
branches, as well.
Tom Lane [Sat, 9 Jan 2016 18:44:27 +0000 (13:44 -0500)]
Clean up code for widget_in() and widget_out().
Given syntactically wrong input, widget_in() could call atof() with an
indeterminate pointer argument, typically leading to a crash; or if it
didn't do that, it might return a NULL pointer, which again would lead
to a crash since old-style C functions aren't supposed to do things
that way. Fix that by correcting the off-by-one syntax test and
throwing a proper error rather than just returning NULL.
Also, since widget_in and widget_out have been marked STRICT for a
long time, their tests for null inputs are just dead code; remove 'em.
In the oldest branches, also improve widget_out to use snprintf not
sprintf, just to be sure.
In passing, get rid of a long-since-useless sprintf into a local buffer
that nothing further is done with, and make some other minor coding
style cleanups.
In the intended regression-testing usage of these functions, none of
this is very significant; but if the regression test database were
left around in a production installation, these bugs could amount
to a minor security hazard.
Tom Lane [Sat, 9 Jan 2016 18:02:54 +0000 (13:02 -0500)]
Add STRICT to some C functions created by the regression tests.
These functions readily crash when passed a NULL input value. The tests
themselves do not pass NULL values to them; but when the regression
database is used as a basis for fuzz testing, they cause a lot of noise.
Also, if someone were to leave a regression database lying about in a
production installation, these would create a minor security hazard.
Tom Lane [Thu, 7 Jan 2016 23:20:58 +0000 (18:20 -0500)]
Fix unobvious interaction between -X switch and subdirectory creation.
Turns out the only reason initdb -X worked is that pg_mkdir_p won't
whine if you point it at something that's a symlink to a directory.
Otherwise, the attempt to create pg_xlog/ just like all the other
subdirectories would have failed. Let's be a little more explicit
about what's happening. Oversight in my patch for bug #13853
(mea culpa for not testing -X ...)
Tom Lane [Thu, 7 Jan 2016 20:22:01 +0000 (15:22 -0500)]
Use plain mkdir() not pg_mkdir_p() to create subdirectories of PGDATA.
When we're creating subdirectories of PGDATA during initdb, we know darn
well that the parent directory exists (or should exist) and that the new
subdirectory doesn't (or shouldn't). There is therefore no need to use
anything more complicated than mkdir(). Using pg_mkdir_p() just opens us
up to unexpected failure modes, such as the one exhibited in bug #13853
from Nuri Boardman. It's not very clear why pg_mkdir_p() went wrong there,
but it is clear that we didn't need to be trying to create parent
directories in the first place. We're not even saving any code, as proven
by the fact that this patch nets out at minus five lines.
Since this is a response to a field bug report, back-patch to all branches.
Alvaro Herrera [Thu, 7 Jan 2016 14:59:08 +0000 (11:59 -0300)]
Windows: Make pg_ctl reliably detect service status
pg_ctl is using isatty() to verify whether the process is running in a
terminal, and if not it sends its output to Windows' Event Log ... which
does the wrong thing when the output has been redirected to a pipe, as
reported in bug #13592.
To fix, make pg_ctl use the code we already have to detect service-ness:
in the master branch, move src/backend/port/win32/security.c to src/port
(with suitable tweaks so that it runs properly in backend and frontend
environments); pg_ctl already has access to pgport so it Just Works. In
older branches, that's likely to cause trouble, so instead duplicate the
required code in pg_ctl.c.
Author: Michael Paquier
Bug report and diagnosis: Egon Kocjan
Backpatch: all supported branches
Tom Lane [Mon, 4 Jan 2016 22:41:33 +0000 (17:41 -0500)]
Fix treatment of *lpNumberOfBytesRecvd == 0: that's a completion condition.
pgwin32_recv() has treated a non-error return of zero bytes from WSARecv()
as being a reason to block ever since the current implementation was
introduced in commit a4c40f140d23cefb. However, so far as one can tell
from Microsoft's documentation, that is just wrong: what it means is
graceful connection closure (in stream protocols) or receipt of a
zero-length message (in message protocols), and neither case should result
in blocking here. The only reason the code worked at all was that control
then fell into the retry loop, which did *not* treat zero bytes specially,
so we'd get out after only wasting some cycles. But as of 9.5 we do not
normally reach the retry loop and so the bug is exposed, as reported by
Shay Rojansky and diagnosed by Andres Freund.
Remove the unnecessary test on the byte count, and rearrange the code
in the retry loop so that it looks identical to the initial sequence.
Back-patch of commit 90e61df8130dc7051a108ada1219fb0680cb3eb6. The
original plan was to apply this only to 9.5 and up, but after discussion
and buildfarm testing, it seems better to back-patch. The noblock code
path has been at risk of this problem since it was introduced (in 9.0);
if it did happen in pre-9.5 branches, the symptom would be that a walsender
would wait indefinitely rather than noticing a loss of connection. While
we lack proof that the case has been seen in the field, it seems possible
that it's happened without being reported.
Tom Lane [Sun, 3 Jan 2016 00:04:45 +0000 (19:04 -0500)]
Teach pg_dump to quote reloption values safely.
Commit c7e27becd2e6eb93 fixed this on the backend side, but we neglected
the fact that several code paths in pg_dump were printing reloptions
values that had not gotten massaged by ruleutils. Apply essentially the
same quoting logic in those places, too.
Tom Lane [Sat, 2 Jan 2016 20:29:03 +0000 (15:29 -0500)]
Adjust back-branch release note description of commits a2a718b22 et al.
As pointed out by Michael Paquier, recovery_min_apply_delay didn't exist
in 9.0-9.3, making the release note text not very useful. Instead make it
talk about recovery_target_xid, which did exist then.
9.0 is already out of support, but we can fix the text in the newer
branches' copies of its release notes.
Tom Lane [Fri, 1 Jan 2016 20:27:53 +0000 (15:27 -0500)]
Teach flatten_reloptions() to quote option values safely.
flatten_reloptions() supposed that it didn't really need to do anything
beyond inserting commas between reloption array elements. However, in
principle the value of a reloption could be nearly anything, since the
grammar allows a quoted string there. Any restrictions on it would come
from validity checking appropriate to the particular option, if any.
A reloption value that isn't a simple identifier or number could thus lead
to dump/reload failures due to syntax errors in CREATE statements issued
by pg_dump. We've gotten away with not worrying about this so far with
the core-supported reloptions, but extensions might allow reloption values
that cause trouble, as in bug #13840 from Kouhei Sutou.
To fix, split the reloption array elements explicitly, and then convert
any value that doesn't look like a safe identifier to a string literal.
(The details of the quoting rule could be debated, but this way is safe
and requires little code.) While we're at it, also quote reloption names
if they're not safe identifiers; that may not be a likely problem in the
field, but we might as well try to be bulletproof here.
It's been like this for a long time, so back-patch to all supported
branches.
Tom Lane [Fri, 1 Jan 2016 18:42:21 +0000 (13:42 -0500)]
Add some more defenses against silly estimates to gincostestimate().
A report from Andy Colson showed that gincostestimate() was not being
nearly paranoid enough about whether to believe the statistics it finds in
the index metapage. The problem is that the metapage stats (other than the
pending-pages count) are only updated by VACUUM, and in the worst case
could still reflect the index's original empty state even when it has grown
to many entries. We attempted to deal with that by scaling up the stats to
match the current index size, but if nEntries is zero then scaling it up
still gives zero. Moreover, the proportion of pages that are entry pages
vs. data pages vs. pending pages is unlikely to be estimated very well by
scaling if the index is now orders of magnitude larger than before.
We can improve matters by expanding the use of the rule-of-thumb estimates
I introduced in commit 7fb008c5ee59b040: if the index has grown by more
than a cutoff amount (here set at 4X growth) since VACUUM, then use the
rule-of-thumb numbers instead of scaling. This might not be exactly right
but it seems much less likely to produce insane estimates.
I also improved both the scaling estimate and the rule-of-thumb estimate
to account for numPendingPages, since it's reasonable to expect that that
is accurate in any case, and certainly pages that are in the pending list
are not either entry or data pages.
As a somewhat separate issue, adjust the estimation equations that are
concerned with extra fetches for partial-match searches. These equations
suppose that a fraction partialEntries / numEntries of the entry and data
pages will be visited as a consequence of a partial-match search. Now,
it's physically impossible for that fraction to exceed one, but our
estimate of partialEntries is mostly bunk, and our estimate of numEntries
isn't exactly gospel either, so we could arrive at a silly value. In the
example presented by Andy we were coming out with a value of 100, leading
to insane cost estimates. Clamp the fraction to one to avoid that.
Like the previous patch, back-patch to all supported branches; this
problem can be demonstrated in one form or another in all of them.
Tom Lane [Mon, 28 Dec 2015 17:09:00 +0000 (12:09 -0500)]
Document the exponentiation operator as associating left to right.
Common mathematical convention is that exponentiation associates right to
left. We aren't going to change the parser for this, but we could note
it in the operator's description. (It's already noted in the operator
precedence/associativity table, but users might not look there.)
Per bug #13829 from Henrik Pauli.
Alvaro Herrera [Sun, 27 Dec 2015 16:03:19 +0000 (13:03 -0300)]
Add forgotten CHECK_FOR_INTERRUPT calls in pgcrypto's crypt()
Both Blowfish and DES implementations of crypt() can take arbitrarily
long time, depending on the number of rounds specified by the caller;
make sure they can be interrupted.
Alvaro Herrera [Mon, 21 Dec 2015 22:49:15 +0000 (19:49 -0300)]
Rework internals of changing a type's ownership
This is necessary so that REASSIGN OWNED does the right thing with
composite types, to wit, that it also alters ownership of the type's
pg_class entry -- previously, the pg_class entry remained owned by the
original user, which caused later other failures such as the new owner's
inability to use ALTER TYPE to rename an attribute of the affected
composite. Also, if the original owner is later dropped, the pg_class
entry becomes owned by a non-existant user which is bogus.
To fix, create a new routine AlterTypeOwner_oid which knows whether to
pass the request to ATExecChangeOwner or deal with it directly, and use
that in shdepReassignOwner rather than calling AlterTypeOwnerInternal
directly. AlterTypeOwnerInternal is now simpler in that it only
modifies the pg_type entry and recurses to handle a possible array type;
higher-level tasks are handled by either AlterTypeOwner directly or
AlterTypeOwner_oid.
I took the opportunity to add a few more objects to the test rig for
REASSIGN OWNED, so that more cases are exercised. Additional ones could
be added for superuser-only-ownable objects (such as FDWs and event
triggers) but I didn't want to push my luck by adding a new superuser to
the tests on a backpatchable bug fix.
Per bug #13666 reported by Chris Pacejo.
This is a backpatch of commit 756e7b4c9db1 to branches 9.1 -- 9.4.
Alvaro Herrera [Mon, 21 Dec 2015 22:16:15 +0000 (19:16 -0300)]
adjust ACL owners for REASSIGN and ALTER OWNER TO
When REASSIGN and ALTER OWNER TO are used, both the object owner and ACL
list should be changed from the old owner to the new owner. This patch
fixes types, foreign data wrappers, and foreign servers to change their
ACL list properly; they already changed owners properly.
Report by Alexey Bashtanov
This is a backpatch of commit 59367fdf97c (for bug #9923) by Bruce
Momjian to branches 9.1 - 9.4; it wasn't backpatched originally out of
concerns that it would create a backwards compatibility problem, but per
discussion related to bug #13666 that turns out to have been misguided.
(Therefore, the entry in the 9.5 release notes should be removed.)
Note that 9.1 didn't have privileges on types (which were introduced by
commit 729205571e81), so this commit only changes foreign-data related
objects in that branch.
Tom Lane [Sun, 20 Dec 2015 23:29:52 +0000 (18:29 -0500)]
Remove silly completion for "DELETE FROM tabname ...".
psql offered USING, WHERE, and SET in this context, but SET is not a valid
possibility here. Seems to have been a thinko in commit f5ab0a14ea83eb6c
which added DELETE's USING option.
Tom Lane [Thu, 17 Dec 2015 21:55:23 +0000 (16:55 -0500)]
Fix improper initialization order for readline.
Turns out we must set rl_basic_word_break_characters *before* we call
rl_initialize() the first time, because it will quietly copy that value
elsewhere --- but only on the first call. (Love these undocumented
dependencies.) I broke this yesterday in commit 2ec477dc8108339d;
like that commit, back-patch to all active branches. Per report from
Pavel Stehule.
Tom Lane [Wed, 16 Dec 2015 21:58:56 +0000 (16:58 -0500)]
Cope with Readline's failure to track SIGWINCH events outside of input.
It emerges that libreadline doesn't notice terminal window size change
events unless they occur while collecting input. This is easy to stumble
over if you resize the window while using a pager to look at query output,
but it can be demonstrated without any pager involvement. The symptom is
that queries exceeding one line are misdisplayed during subsequent input
cycles, because libreadline has the wrong idea of the screen dimensions.
The safest, simplest way to fix this is to call rl_reset_screen_size()
just before calling readline(). That causes an extra ioctl(TIOCGWINSZ)
for every command; but since it only happens when reading from a tty, the
performance impact should be negligible. A more valid objection is that
this still leaves a tiny window during entry to readline() wherein delivery
of SIGWINCH will be missed; but the practical consequences of that are
probably negligible. In any case, there doesn't seem to be any good way to
avoid the race, since readline exposes no functions that seem safe to call
from a generic signal handler --- rl_reset_screen_size() certainly isn't.
It turns out that we also need an explicit rl_initialize() call, else
rl_reset_screen_size() dumps core when called before the first readline()
call.
rl_reset_screen_size() is not present in old versions of libreadline,
so we need a configure test for that. (rl_initialize() is present at
least back to readline 4.0, so we won't bother with a test for it.)
We would need a configure test anyway since libedit's emulation of
libreadline doesn't currently include such a function. Fortunately,
libedit seems not to have any corresponding bug.
Alvaro Herrera [Mon, 14 Dec 2015 19:44:40 +0000 (16:44 -0300)]
Add missing CHECK_FOR_INTERRUPTS in lseg_inside_poly
Apparently, there are bugs in this code that cause it to loop endlessly.
That bug still needs more research, but in the meantime it's clear that
the loop is missing a check for interrupts so that it can be cancelled
timely.
Backpatch to 9.1 -- this has been missing since 49475aab8d0d.
Fix out-of-memory error handling in ParameterDescription message processing.
If libpq ran out of memory while constructing the result set, it would hang,
waiting for more data from the server, which might never arrive. To fix,
distinguish between out-of-memory error and not-enough-data cases, and give
a proper error message back to the client on OOM.
There are still similar issues in handling COPY start messages, but let's
handle that as a separate patch.
Michael Paquier, Amit Kapila and me. Backpatch to all supported versions.
Tom Lane [Mon, 14 Dec 2015 04:42:54 +0000 (23:42 -0500)]
Docs: document that psql's "\i -" means read from stdin.
This has worked that way for a long time, maybe always, but you would
not have known it from the documentation. Also back-patch the notes
I added to HEAD earlier today about behavior of the "-f -" switch,
which likewise have been valid for many releases.
Andres Freund [Sat, 12 Dec 2015 13:19:29 +0000 (14:19 +0100)]
Fix ALTER TABLE ... SET TABLESPACE for unlogged relations.
Changing the tablespace of an unlogged relation did not WAL log the
creation and content of the init fork. Thus, after a standby is
promoted, unlogged relation cannot be accessed anymore, with errors
like:
ERROR: 58P01: could not open file "pg_tblspc/...": No such file or directory
Additionally the init fork was not synced to disk, independent of the
configured wal_level, a relatively small durability risk.
Investigation of that problem also brought to light that, even for
permanent relations, the creation of !main forks was not WAL logged,
i.e. no XLOG_SMGR_CREATE record were emitted. That mostly turns out not
to be a problem, because these files were created when the actual
relation data is copied; nonexistent files are not treated as an error
condition during replay. But that doesn't work for empty files, and
generally feels a bit haphazard. Luckily, outside init and main forks,
empty forks don't occur often or are not a problem.
Add the required WAL logging and syncing to disk.
Reported-By: Michael Paquier
Author: Michael Paquier and Andres Freund
Discussion: 20151210163230.GA11331@alap3.anarazel.de
Backpatch: 9.1, where unlogged relations were introduced
Tom Lane [Sat, 12 Dec 2015 00:08:40 +0000 (19:08 -0500)]
Add an expected-file to match behavior of latest libxml2.
Recent releases of libxml2 do not provide error context reports for errors
detected at the very end of the input string. This appears to be a bug, or
at least an infelicity, introduced by the fix for libxml2's CVE-2015-7499.
We can hope that this behavioral change will get undone before too long;
but the security patch is likely to spread a lot faster/further than any
follow-on cleanup, which means this behavior is likely to be present in the
wild for some time to come. As a stopgap, add a variant regression test
expected-file that matches what you get with a libxml2 that acts this way.
Alvaro Herrera [Fri, 11 Dec 2015 21:39:09 +0000 (18:39 -0300)]
For REASSIGN OWNED for foreign user mappings
As reported in bug #13809 by Alexander Ashurkov, the code for REASSIGN
OWNED hadn't gotten word about user mappings. Deal with them in the
same way default ACLs do, which is to ignore them altogether; they are
handled just fine by DROP OWNED. The other foreign object cases are
already handled correctly by both commands.
Also add a REASSIGN OWNED statement to foreign_data test to exercise the
foreign data objects. (The changes are just before the "cleanup" phase,
so it shouldn't remove any existing live test.)
Reported by Alexander Ashurkov, then independently by Jaime Casanova.
Andres Freund [Thu, 10 Dec 2015 15:25:12 +0000 (16:25 +0100)]
Fix bug leading to restoring unlogged relations from empty files.
At the end of crash recovery, unlogged relations are reset to the empty
state, using their init fork as the template. The init fork is copied to
the main fork without going through shared buffers. Unfortunately WAL
replay so far has not necessarily flushed writes from shared buffers to
disk at that point. In normal crash recovery, and before the
introduction of 'fast promotions' in fd4ced523 / 9.3, the
END_OF_RECOVERY checkpoint flushes the buffers out in time. But with
fast promotions that's not the case anymore.
To fix, force WAL writes targeting the init fork to be flushed
immediately (using the new FlushOneBuffer() function). In 9.5+ that
flush can centrally be triggered from the code dealing with restoring
full page writes (XLogReadBufferForRedoExtended), in earlier releases
that responsibility is in the hands of XLOG_HEAP_NEWPAGE's replay
function.
Backpatch to 9.1, even if this currently is only known to trigger in
9.3+. Flushing earlier is more robust, and it is advantageous to keep
the branches similar.
Typical symptoms of this bug are errors like
'ERROR: index "..." contains unexpected zero page at block 0'
shortly after promoting a node.
Reported-By: Thom Brown
Author: Andres Freund and Michael Paquier
Discussion: 20150326175024.GJ451@alap3.anarazel.de
Backpatch: 9.1-
Tom Lane [Fri, 4 Dec 2015 19:44:13 +0000 (14:44 -0500)]
Further improve documentation of the role-dropping process.
In commit 1ea0c73c2 I added a section to user-manag.sgml about how to drop
roles that own objects; but as pointed out by Stephen Frost, I neglected
that shared objects (databases or tablespaces) may need special treatment.
Fix that. Back-patch to supported versions, like the previous patch.
Tom Lane [Tue, 1 Dec 2015 21:24:35 +0000 (16:24 -0500)]
Make gincostestimate() cope with hypothetical GIN indexes.
We tried to fetch statistics data from the index metapage, which does not
work if the index isn't actually present. If the index is hypothetical,
instead extrapolate some plausible internal statistics based on the index
page count provided by the index-advisor plugin.
There was already some code in gincostestimate() to invent internal stats
in this way, but since it was only meant as a stopgap for pre-9.1 GIN
indexes that hadn't been vacuumed since upgrading, it was pretty crude.
If we want it to support index advisors, we should try a little harder.
A small amount of testing says that it's better to estimate the entry pages
as 90% of the index, not 100%. Also, estimating the number of entries
(keys) as equal to the heap tuple count could be wildly wrong in either
direction. Instead, let's estimate 100 entries per entry page.
Perhaps someday somebody will want the index advisor to be able to provide
these numbers more directly, but for the moment this should serve.
Problem report and initial patch by Julien Rouhaud; modified by me to
invent less-bogus internal statistics. Back-patch to all supported
branches, since we've supported index advisors since 9.0.
Tom Lane [Tue, 1 Dec 2015 16:42:25 +0000 (11:42 -0500)]
Use "g" not "f" format in ecpg's PGTYPESnumeric_from_double().
The previous coding could overrun the provided buffer size for a very large
input, or lose precision for a very small input. Adopt the methodology
that's been in use in the equivalent backend code for a long time.
Per private report from Bas van Schaik. Back-patch to all supported
branches.
Tom Lane [Thu, 26 Nov 2015 18:23:03 +0000 (13:23 -0500)]
Fix failure to consider failure cases in GetComboCommandId().
Failure to initially palloc the comboCids array, or to realloc it bigger
when needed, left combocid's data structures in an inconsistent state that
would cause trouble if the top transaction continues to execute. Noted
while examining a user complaint about the amount of memory used for this.
(There's not much we can do about that, but it does point up that repalloc
failure has a non-negligible chance of occurring here.)
In HEAD/9.5, also avoid possible invocation of memcpy() with a null pointer
in SerializeComboCIDState; cf commit 13bba0227.
Tom Lane [Wed, 25 Nov 2015 22:31:54 +0000 (17:31 -0500)]
Be more paranoid about null return values from libpq status functions.
PQhost() can return NULL in non-error situations, namely when a Unix-socket
connection has been selected by default. That behavior is a tad debatable
perhaps, but for the moment we should make sure that psql copes with it.
Unfortunately, do_connect() failed to: it could pass a NULL pointer to
strcmp(), resulting in crashes on most platforms. This was reported as a
security issue by ChenQin of Topsec Security Team, but the consensus of
the security list is that it's just a garden-variety bug with no security
implications.
For paranoia's sake, I made the keep_password test not trust PQuser or
PQport either, even though I believe those will never return NULL given
a valid PGconn.
Tom Lane [Sun, 22 Nov 2015 01:21:32 +0000 (20:21 -0500)]
Adopt the GNU convention for handling tar-archive members exceeding 8GB.
The POSIX standard for tar headers requires archive member sizes to be
printed in octal with at most 11 digits, limiting the representable file
size to 8GB. However, GNU tar and apparently most other modern tars
support a convention in which oversized values can be stored in base-256,
allowing any practical file to be a tar member. Adopt this convention
to remove two limitations:
* pg_dump with -Ft output format failed if the contents of any one table
exceeded 8GB.
* pg_basebackup failed if the data directory contained any file exceeding
8GB. (This would be a fatal problem for installations configured with a
table segment size of 8GB or more, and it has also been seen to fail when
large core dump files exist in the data directory.)
File sizes under 8GB are still printed in octal, so that no compatibility
issues are created except in cases that would have failed entirely before.
In addition, this patch fixes several bugs in the same area:
* In 9.3 and later, we'd defined tarCreateHeader's file-size argument as
size_t, which meant that on 32-bit machines it would write a corrupt tar
header for file sizes between 4GB and 8GB, even though no error was raised.
This broke both "pg_dump -Ft" and pg_basebackup for such cases.
* pg_restore from a tar archive would fail on tables of size between 4GB
and 8GB, on machines where either "size_t" or "unsigned long" is 32 bits.
This happened even with an archive file not affected by the previous bug.
* pg_basebackup would fail if there were files of size between 4GB and 8GB,
even on 64-bit machines.
* In 9.3 and later, "pg_basebackup -Ft" failed entirely, for any file size,
on 64-bit big-endian machines.
In view of these potential data-loss bugs, back-patch to all supported
branches, even though removal of the documented 8GB limit might otherwise
be considered a new feature rather than a bug fix.
Tom Lane [Fri, 20 Nov 2015 19:55:29 +0000 (14:55 -0500)]
Fix handling of inherited check constraints in ALTER COLUMN TYPE (again).
The previous way of reconstructing check constraints was to do a separate
"ALTER TABLE ONLY tab ADD CONSTRAINT" for each table in an inheritance
hierarchy. However, that way has no hope of reconstructing the check
constraints' own inheritance properties correctly, as pointed out in
bug #13779 from Jan Dirk Zijlstra. What we should do instead is to do
a regular "ALTER TABLE", allowing recursion, at the topmost table that
has a particular constraint, and then suppress the work queue entries
for inherited instances of the constraint.
Annoyingly, we'd tried to fix this behavior before, in commit 5ed6546cf,
but we failed to notice that it wasn't reconstructing the pg_constraint
field values correctly.
As long as I'm touching pg_get_constraintdef_worker anyway, tweak it to
always schema-qualify the target table name; this seems like useful backup
to the protections installed by commit 5f173040.
In HEAD/9.5, get rid of get_constraint_relation_oids, which is now unused.
(I could alternatively have modified it to also return conislocal, but that
seemed like a pretty single-purpose API, so let's not pretend it has some
other use.) It's unused in the back branches as well, but I left it in
place just in case some third-party code has decided to use it.
In HEAD/9.5, also rename pg_get_constraintdef_string to
pg_get_constraintdef_command, as the previous name did nothing to explain
what that entry point did differently from others (and its comment was
equally useless). Again, that change doesn't seem like material for
back-patching.
I did a bit of re-pgindenting in tablecmds.c in HEAD/9.5, as well.
Tom Lane [Wed, 18 Nov 2015 22:45:06 +0000 (17:45 -0500)]
Accept flex > 2.5.x in configure.
Per buildfarm member anchovy, 2.6.0 exists in the wild now.
Hopefully it works with Postgres; if not, we'll have to do something
about that, but in any case claiming it's "too old" is pretty silly.
Tom Lane [Tue, 17 Nov 2015 20:46:47 +0000 (15:46 -0500)]
Fix possible internal overflow in numeric division.
div_var_fast() postpones propagating carries in the same way as mul_var(),
so it has the same corner-case overflow risk we fixed in 246693e5ae8a36f0,
namely that the size of the carries has to be accounted for when setting
the threshold for executing a carry propagation step. We've not devised
a test case illustrating the brokenness, but the required fix seems clear
enough. Like the previous fix, back-patch to all active branches.
Tom Lane [Sun, 15 Nov 2015 19:41:09 +0000 (14:41 -0500)]
Fix ruleutils.c's dumping of whole-row Vars in ROW() and VALUES() contexts.
Normally ruleutils prints a whole-row Var as "foo.*". We already knew that
that doesn't work at top level of a SELECT list, because the parser would
treat the "*" as a directive to expand the reference into separate columns,
not a whole-row Var. However, Joshua Yanovski points out in bug #13776
that the same thing happens at top level of a ROW() construct; and some
nosing around in the parser shows that the same is true in VALUES().
Hence, apply the same workaround already devised for the SELECT-list case,
namely to add a forced cast to the appropriate rowtype in these cases.
(The alternative of just printing "foo" was rejected because it is
difficult to avoid ambiguity against plain columns named "foo".)
Tom Lane [Tue, 10 Nov 2015 20:59:59 +0000 (15:59 -0500)]
Improve our workaround for 'TeX capacity exceeded' in building PDF files.
In commit a5ec86a7c787832d28d5e50400ec96a5190f2555 I wrote a quick hack
that reduced the number of TeX string pool entries created while converting
our documentation to PDF form. That held the fort for awhile, but as of
HEAD we're back up against the same limitation. It turns out that the
original coding of \FlowObjectSetup actually results in *three* string pool
entries being generated for every "flow object" (that is, potential
cross-reference target) in the documentation, and my previous hack only got
rid of one of them. With a little more care, we can reduce the string
count to one per flow object plus one per actually-cross-referenced flow
object (about 115000 + 5000 as of current HEAD); that should work until
the documentation volume roughly doubles from where it is today.
As a not-incidental side benefit, this change also causes pdfjadetex to
stop emitting unreferenced hyperlink anchors (bookmarks) into the PDF file.
It had been making one willy-nilly for every flow object; now it's just one
per actually-cross-referenced object. This results in close to a 2X
savings in PDF file size. We will still want to run the output through
"jpdftweak" to get it to be compressed; but we no longer need removal of
unreferenced bookmarks, so we might be able to find a quicker tool for
that step.
Although the failure only affects HEAD and US-format output at the moment,
9.5 cannot be more than a few pages short of failing likewise, so it
will inevitably fail after a few rounds of minor-version release notes.
I don't have a lot of faith that we'll never hit the limit in the older
branches; and anyway it would be nice to get rid of jpdftweak across the
board. Therefore, back-patch to all supported branches.
Noah Misch [Sun, 8 Nov 2015 22:28:53 +0000 (17:28 -0500)]
Don't connect() to a wildcard address in test_postmaster_connection().
At least OpenBSD, NetBSD, and Windows don't support it. This repairs
pg_ctl for listen_addresses='0.0.0.0' and listen_addresses='::'. Since
pg_ctl prefers to test a Unix-domain socket, Windows users are most
likely to need this change. Back-patch to 9.1 (all supported versions).
This could change pg_ctl interaction with loopback-interface firewall
rules. Therefore, in 9.4 and earlier (released branches), activate the
change only on known-affected platforms.
Tom Lane [Sat, 7 Nov 2015 17:43:24 +0000 (12:43 -0500)]
Fix enforcement of restrictions inside regexp lookaround constraints.
Lookahead and lookbehind constraints aren't allowed to contain backrefs,
and parentheses within them are always considered non-capturing. Or so
says the manual. But the regexp parser forgot about these rules once
inside a parenthesized subexpression, so that constructs like (\w)(?=(\1))
were accepted (but then not correctly executed --- a case like this acted
like (\w)(?=\w), without any enforcement that the two \w's match the same
text). And in (?=((foo))) the innermost parentheses would be counted as
capturing parentheses, though no text would ever be captured for them.
To fix, properly pass down the "type" argument to the recursive invocation
of parse().
Back-patch to all supported branches; it was agreed that silent
misexecution of such patterns is worse than throwing an error, even though
new errors in minor releases are generally not desirable.
Kevin Grittner [Sat, 31 Oct 2015 19:36:58 +0000 (14:36 -0500)]
Fix serialization anomalies due to race conditions on INSERT.
On insert the CheckForSerializableConflictIn() test was performed
before the page(s) which were going to be modified had been locked
(with an exclusive buffer content lock). If another process
acquired a relation SIReadLock on the heap and scanned to a page on
which an insert was going to occur before the page was so locked,
a rw-conflict would be missed, which could allow a serialization
anomaly to be missed. The window between the check and the page
lock was small, so the bug was generally not noticed unless there
was high concurrency with multiple processes inserting into the
same table.
This was reported by Peter Bailis as bug #11732, by Sean Chittenden
as bug #13667, and by others.
The race condition was eliminated in heap_insert() by moving the
check down below the acquisition of the buffer lock, which had been
the very next statement. Because of the loop locking and unlocking
multiple buffers in heap_multi_insert() a check was added after all
inserts were completed. The check before the start of the inserts
was left because it might avoid a large amount of work to detect a
serialization anomaly before performing the all of the inserts and
the related WAL logging.
While investigating this bug, other SSI bugs which were even harder
to hit in practice were noticed and fixed, an unnecessary check
(covered by another check, so redundant) was removed from
heap_update(), and comments were improved.
Noah Misch [Tue, 20 Oct 2015 04:37:22 +0000 (00:37 -0400)]
Eschew "RESET statement_timeout" in tests.
Instead, use transaction abort. Given an unlucky bout of latency, the
timeout would cancel the RESET itself. Buildfarm members gharial,
lapwing, mereswine, shearwater, and sungazer witness that. Back-patch
to 9.1 (all supported versions). The query_canceled test still could
timeout before entering its subtransaction; for whatever reason, that
has yet to happen on the buildfarm.
Tom Lane [Mon, 19 Oct 2015 20:54:54 +0000 (13:54 -0700)]
Fix incorrect handling of lookahead constraints in pg_regprefix().
pg_regprefix was doing nothing with lookahead constraints, which would
be fine if it were the right kind of nothing, but it isn't: we have to
terminate our search for a fixed prefix, not just pretend the LACON arc
isn't there. Otherwise, if the current state has both a LACON outarc and a
single plain-color outarc, we'd falsely conclude that the color represents
an addition to the fixed prefix, and generate an extracted index condition
that restricts the indexscan too much. (See added regression test case.)
Terminating the search is conservative: we could traverse the LACON arc
(thus assuming that the constraint can be satisfied at runtime) and then
examine the outarcs of the linked-to state. But that would be a lot more
work than it seems worth, because writing a LACON followed by a single
plain character is a pretty silly thing to do.
This makes a difference only in rather contrived cases, but it's a bug,
so back-patch to all supported branches.
Tom Lane [Fri, 16 Oct 2015 19:52:12 +0000 (15:52 -0400)]
Miscellaneous cleanup of regular-expression compiler.
Revert our previous addition of "all" flags to copyins() and copyouts();
they're no longer needed, and were never anything but an unsightly hack.
Improve a couple of infelicities in the REG_DEBUG code for dumping
the NFA data structure, including adding code to count the total
number of states and arcs.
Add a couple of missed error checks.
Add some more documentation in the README file, and some regression tests
illustrating cases that exceeded the state-count limit and/or took
unreasonable amounts of time before this set of patches.