Tom Lane [Wed, 22 Oct 2014 22:41:44 +0000 (18:41 -0400)]
Ensure libpq reports a suitable error message on unexpected socket EOF.
The EOF-detection logic in pqReadData was a bit confused about who should
set up the error message in case the kernel gives us read-ready-but-no-data
rather than ECONNRESET or some other explicit error condition. Since the
whole point of this situation is that the lower-level functions don't know
there's anything wrong, pqReadData itself must set up the message. But
keep the assumption that if an errno was reported, a message was set up at
lower levels.
Per bug #11712 from Marko Tiikkaja. It's been like this for a very long
time, so back-patch to all supported branches.
Noah Misch [Wed, 22 Oct 2014 02:55:47 +0000 (22:55 -0400)]
MinGW: Use -static-libgcc when linking a DLL.
When commit 846e91e0223cf9f2821c3ad4dfffffbb929cb027 switched the linker
driver from dlltool/dllwrap to gcc, it became possible for linking to
choose shared libgcc. Backends having loaded a module dynamically
linked to libgcc can exit abnormally, which the postmaster treats like a
crash. Resume use of static libgcc exclusively, like 9.3 and earlier.
Back-patch to 9.4.
Noah Misch [Wed, 22 Oct 2014 02:55:43 +0000 (22:55 -0400)]
MinGW: Link with shell32.dll instead of shfolder.dll.
This improves consistency with the MSVC build. On buildfarm member
narwhal, since commit 846e91e0223cf9f2821c3ad4dfffffbb929cb027,
shfolder.dll:SHGetFolderPath() crashes when dblink calls it by way of
pqGetHomeDirectory(). Back-patch to 9.4, where that commit first
appeared. How it caused this regression remains a mystery. This is a
partial revert of commit 889f03812916b146ae504c0fad5afdc7bf2e8a2a, which
adopted shfolder.dll for Windows NT 4.0 compatibility. PostgreSQL 8.2
dropped support for that operating system.
Peter Eisentraut [Tue, 21 Oct 2014 18:46:38 +0000 (14:46 -0400)]
doc: Check DocBook XML validity during the build
Building the documentation with XSLT does not check the DTD, like a
DSSSL build would. One can often get away with having invalid XML, but
the stylesheets might then create incorrect output, as they are not
designed to handle that. Therefore, check the validity of the XML
against the DTD, using xmllint, during the build.
Add xmllint detection to configure, and add some documentation.
xmllint comes with libxml2, which is already in use, but it might be in
a separate package, such as libxml2-utils on Debian.
Peter Eisentraut [Tue, 21 Oct 2014 14:43:09 +0000 (10:43 -0400)]
doc: Update Red Hat documentation tools information
The old text was written in ancient times when RPM packages could be
shared more or less freely across a plethora of RPM-based Linux
distributions. This isn't really the case anymore, so just make this
information more concrete for the Red Hat family.
Andres Freund [Mon, 20 Oct 2014 23:01:56 +0000 (01:01 +0200)]
Don't duplicate log_checkpoint messages for both of restart and checkpoints.
The duplication originated in cdd46c765, where restartpoints were
introduced.
In LogCheckpointStart's case the duplication actually lead to the
compiler's format string checking not to be effective because the
format string wasn't constant.
Arguably these messages shouldn't be elog(), but ereport() style
messages. That'd even allow to translate the messages... But as
there's more mistakes of that kind in surrounding code, it seems
better to change that separately.
Andres Freund [Mon, 20 Oct 2014 22:20:08 +0000 (00:20 +0200)]
Renumber CHECKPOINT_* flags.
Commit 7dbb6069382 added a new CHECKPOINT_FLUSH_ALL flag. As that
commit needed to be backpatched I didn't change the numeric values of
the existing flags as that could lead to nastly problems if any
external code issued checkpoints. That's not a concern on master, so
renumber them there.
Also add a comment about CHECKPOINT_FLUSH_ALL above
CreateCheckPoint().
Andres Freund [Mon, 20 Oct 2014 21:43:46 +0000 (23:43 +0200)]
Flush unlogged table's buffers when copying or moving databases.
CREATE DATABASE and ALTER DATABASE .. SET TABLESPACE copy the source
database directory on the filesystem level. To ensure the on disk
state is consistent they block out users of the affected database and
force a checkpoint to flush out all data to disk. Unfortunately, up to
now, that checkpoint didn't flush out dirty buffers from unlogged
relations.
That bug means there could be leftover dirty buffers in either the
template database, or the database in its old location. Leading to
problems when accessing relations in an inconsistent state; and to
possible problems during shutdown in the SET TABLESPACE case because
buffers belonging files that don't exist anymore are flushed.
This was reported in bug #10675 by Maxim Boguk.
Fix by Pavan Deolasee, modified somewhat by me. Reviewed by MauMau and
Fujii Masao.
Backpatch to 9.1 where unlogged tables were introduced.
Andrew Dunstan [Mon, 20 Oct 2014 18:55:35 +0000 (14:55 -0400)]
Correct volatility markings of a few json functions.
json_agg and json_object_agg and their associated transition functions
should have been marked as stable rather than immutable, as they call IO
functions indirectly. Changing this probably isn't going to make much
difference, as you can't use an aggregate function in an index
expression, but we should be correct nevertheless.
json_object, on the other hand, should be marked immutable rather than
stable, as it does not call IO functions.
As discussed on -hackers, this change is being made without bumping the
catalog version, as we don't want to do that at this stage of the cycle,
and the changes are very unlikely to affect anyone.
Tom Lane [Mon, 20 Oct 2014 16:23:42 +0000 (12:23 -0400)]
Fix mishandling of FieldSelect-on-whole-row-Var in nested lateral queries.
If an inline-able SQL function taking a composite argument is used in a
LATERAL subselect, and the composite argument is a lateral reference,
the planner could fail with "variable not found in subplan target list",
as seen in bug #11703 from Karl Bartel. (The outer function call used in
the bug report and in the committed regression test is not really necessary
to provoke the bug --- you can get it if you manually expand the outer
function into "LATERAL (SELECT inner_function(outer_relation))", too.)
The cause of this is that we generate the reltargetlist for the referenced
relation before doing eval_const_expressions() on the lateral sub-select's
expressions (cf find_lateral_references()), so what's scheduled to be
emitted by the referenced relation is a whole-row Var, not the simplified
single-column Var produced by optimizing the function's FieldSelect on the
whole-row Var. Then setrefs.c fails to match up that lateral reference to
what's available from the outer scan.
Preserving the FieldSelect optimization in such cases would require either
major planner restructuring (to recursively do expression simplification
on sub-selects much earlier) or some amazingly ugly kluge to change the
reltargetlist of a possibly-already-planned relation. It seems better
just to skip the optimization when the Var is from an upper query level;
the case is not so common that it's likely anyone will notice a few
wasted cycles.
AFAICT this problem only occurs for uplevel LATERAL references, so
back-patch to 9.3 where LATERAL was added.
Peter Eisentraut [Sun, 19 Oct 2014 01:58:17 +0000 (21:58 -0400)]
psql: Improve \pset without arguments
Revert the output of the individual backslash commands that change print
settings back to the 9.3 way (not showing the command name in
parentheses). Implement \pset without arguments separately, showing all
settings with values in a table form.
Tom Lane [Sat, 18 Oct 2014 02:55:20 +0000 (22:55 -0400)]
Declare mkdtemp() only if we're providing it.
Follow our usual style of providing an "extern" for a standard library
function only when we're also providing the implementation. This avoids
issues when the system headers declare the function slightly differently
than we do, as noted by Caleb Welton.
We might have to go to the extent of probing to see if the system headers
declare the function, but let's not do that until it's demonstrated to be
necessary.
Tom Lane [Sat, 18 Oct 2014 02:32:55 +0000 (22:32 -0400)]
Avoid core dump in _outPathInfo() for Path without a parent RelOptInfo.
Nearly all Paths have parents, but a ResultPath representing an empty FROM
clause does not. Avoid a core dump in such cases. I believe this is only
a hazard for debugging usage, not for production, else we'd have heard
about it before. Nonetheless, back-patch to 9.1 where the troublesome code
was introduced. Noted while poking at bug #11703.
Fujii Masao [Fri, 17 Oct 2014 18:06:34 +0000 (03:06 +0900)]
Fix bug in handling of connections that pg_receivexlog creates.
Previously pg_receivexlog created new connection for WAL streaming
even though another connection which had been established to create
or delete the replication slot was being left. This caused the unused
connection to be left uselessly until pg_receivexlog exited.
This bug was introduced by the commit d9f38c7.
This patch changes pg_receivexlog so that the connection for
the replication slot is reused for WAL streaming.
Andres Freund, slightly modified by me, reviewed by Michael Paquier
Tom Lane [Fri, 17 Oct 2014 16:49:00 +0000 (12:49 -0400)]
Fix core dump in pg_dump --binary-upgrade on zero-column composite type.
This reverts nearly all of commit 28f6cab61ab8958b1a7dfb019724687d92722538
in favor of just using the typrelid we already have in pg_dump's TypeInfo
struct for the composite type. As coded, it'd crash if the composite type
had no attributes, since then the query would return no rows.
Back-patch to all supported versions. It seems to not really be a problem
in 9.0 because that version rejects the syntax "create type t as ()", but
we might as well keep the logic similar in all affected branches.
Tom Lane [Thu, 16 Oct 2014 19:22:10 +0000 (15:22 -0400)]
Support timezone abbreviations that sometimes change.
Up to now, PG has assumed that any given timezone abbreviation (such as
"EDT") represents a constant GMT offset in the usage of any particular
region; we had a way to configure what that offset was, but not for it
to be changeable over time. But, as with most things horological, this
view of the world is too simplistic: there are numerous regions that have
at one time or another switched to a different GMT offset but kept using
the same timezone abbreviation. Almost the entire Russian Federation did
that a few years ago, and later this month they're going to do it again.
And there are similar examples all over the world.
To cope with this, invent the notion of a "dynamic timezone abbreviation",
which is one that is referenced to a particular underlying timezone
(as defined in the IANA timezone database) and means whatever it currently
means in that zone. For zones that use or have used daylight-savings time,
the standard and DST abbreviations continue to have the property that you
can specify standard or DST time and get that time offset whether or not
DST was theoretically in effect at the time. However, the abbreviations
mean what they meant at the time in question (or most recently before that
time) rather than being absolutely fixed.
The standard abbreviation-list files have been changed to use this behavior
for abbreviations that have actually varied in meaning since 1970. The
old simple-numeric definitions are kept for abbreviations that have not
changed, since they are a bit faster to resolve.
While this is clearly a new feature, it seems necessary to back-patch it
into all active branches, because otherwise use of Russian zone
abbreviations is going to become even more problematic than it already was.
This change supersedes the changes in commit 513d06ded et al to modify the
fixed meanings of the Russian abbreviations; since we've not shipped that
yet, this will avoid an undesirably incompatible (not to mention incorrect)
change in behavior for timestamps between 2011 and 2014.
This patch makes some cosmetic changes in ecpglib to keep its usage of
datetime lookup tables as similar as possible to the backend code, but
doesn't do anything about the increasingly obsolete set of timezone
abbreviation definitions that are hard-wired into ecpglib. Whatever we
do about that will likely not be appropriate material for back-patching.
Also, a potential free() of a garbage pointer after an out-of-memory
failure in ecpglib has been fixed.
This patch also fixes pre-existing bugs in DetermineTimeZoneOffset() that
caused it to produce unexpected results near a timezone transition, if
both the "before" and "after" states are marked as standard time. We'd
only ever thought about or tested transitions between standard and DST
time, but that's not what's happening when a zone simply redefines their
base GMT offset.
In passing, update the SGML documentation to refer to the Olson/zoneinfo/
zic timezone database as the "IANA" database, since it's now being
maintained under the auspices of IANA.
Tom Lane [Wed, 15 Oct 2014 22:50:13 +0000 (18:50 -0400)]
Print planning time only in EXPLAIN ANALYZE, not plain EXPLAIN.
We've gotten enough push-back on that change to make it clear that it
wasn't an especially good idea to do it like that. Revert plain EXPLAIN
to its previous behavior, but keep the extra output in EXPLAIN ANALYZE.
Per discussion.
Internally, I set this up as a separate flag ExplainState.summary that
controls printing of planning time and execution time. For now it's
just copied from the ANALYZE option, but we could consider exposing it
to users.
Alvaro Herrera [Tue, 14 Oct 2014 18:00:55 +0000 (15:00 -0300)]
pg_dump: Reduce use of global variables
Most pg_dump.c global variables, which were passed down individually to
dumping routines, are now grouped as members of the new DumpOptions
struct, which is used as a local variable and passed down into routines
that need it. This helps future development efforts; in particular it
is said to enable a mode in which a parallel pg_dump run can output
multiple streams, and have them restored in parallel.
Also take the opportunity to clean up the pg_dump header files somewhat,
to avoid circularity.
Author: Joachim Wieland, revised by Álvaro Herrera
Reviewed by Peter Eisentraut
Fix deadlock with LWLockAcquireWithVar and LWLockWaitForVar.
LWLockRelease should release all backends waiting with LWLockWaitForVar,
even when another backend has already been woken up to acquire the lock,
i.e. when releaseOK is false. LWLockWaitForVar can return as soon as the
protected value changes, even if the other backend will acquire the lock.
Fix that by resetting releaseOK to true in LWLockWaitForVar, whenever
adding itself to the wait queue.
This should fix the bug reported by MauMau, where the system occasionally
hangs when there is a lot of concurrent WAL activity and a checkpoint.
Backpatch to 9.4, where this code was added.
Peter Eisentraut [Tue, 14 Oct 2014 02:17:34 +0000 (22:17 -0400)]
doc: Improve ALTER VIEW / SET documentation
The way the ALTER VIEW / SET options were listed in the synopsis was
very confusing. Move the list to the main description, similar to how
the ALTER TABLE reference page does it.
This allows transactions that take longer than specified limit to be counted
separately. With --rate, transactions that are already late by the time we
get to execute them are skipped altogether. Using --latency-limit with
--rate allows you to "catch up" more quickly, if there's a hickup in the
server causing a lot of transactions to stall momentarily.
Fabien COELHO, reviewed by Rukh Meski and heavily refactored by me.
Kevin Grittner [Mon, 13 Oct 2014 15:16:36 +0000 (10:16 -0500)]
Increase number of hash join buckets for underestimate.
If we expect batching at the very beginning, we size nbuckets for
"full work_mem" (see how many tuples we can get into work_mem,
while not breaking NTUP_PER_BUCKET threshold).
If we expect to be fine without batching, we start with the 'right'
nbuckets and track the optimal nbuckets as we go (without actually
resizing the hash table). Once we hit work_mem (considering the
optimal nbuckets value), we keep the value.
At the end of the first batch, we check whether (nbuckets !=
nbuckets_optimal) and resize the hash table if needed. Also, we
keep this value for all batches (it's OK because it assumes full
work_mem, and it makes the batchno evaluation trivial). So the
resize happens only once.
There could be cases where it would improve performance to allow
the NTUP_PER_BUCKET threshold to be exceeded to keep everything in
one batch rather than spilling to a second batch, but attempts to
generate such a case have so far been unsuccessful; that issue may
be addressed with a follow-on patch after further investigation.
Tomas Vondra with minor format and comment cleanup by me
Reviewed by Robert Haas, Heikki Linnakangas, and Kevin Grittner
Noah Misch [Mon, 13 Oct 2014 03:33:37 +0000 (23:33 -0400)]
Fix quoting in the add_to_path Makefile macro.
The previous quoting caused "make -C src/bin check" to ignore, rather
than add to, any LD_LIBRARY_PATH content from the environment.
Back-patch to 9.4, where the macro was introduced.
Noah Misch [Mon, 13 Oct 2014 03:27:06 +0000 (23:27 -0400)]
Suppress dead, unportable src/port/crypt.c code.
This file used __int64, which is specific to native Windows, rather than
int64. Suppress the long-unused union field of this type. Noticed on
Cygwin x86_64 with -lcrypt not installed. Back-patch to 9.0 (all
supported versions).
Peter Eisentraut [Sun, 12 Oct 2014 05:45:25 +0000 (01:45 -0400)]
pg_recvlogical: Improve --help output
List the actions first, as they are the most important options. Group
the other options more sensibly, consistent with the man page. Correct
a few typographical errors, clarify some things.
Also update the pg_receivexlog --help output to make it a bit more
consistent with that of pg_recvlogical.
Tom Lane [Sat, 11 Oct 2014 18:13:51 +0000 (14:13 -0400)]
Fix bogus optimization in JSONB containment tests.
When determining whether one JSONB object contains another, it's okay to
make a quick exit if the first object has fewer pairs than the second:
because we de-duplicate keys within objects, it is impossible that the
first object has all the keys the second does. However, the code was
applying this rule to JSONB arrays as well, where it does *not* hold
because arrays can contain duplicate entries. The test was really in
the wrong place anyway; we should do it within JsonbDeepContains, where
it can be applied to nested objects not only top-level ones.
Report and test cases by Alexander Korotkov; fix by Peter Geoghegan and
Tom Lane.
Change the way encoding and locale checks are done in pg_upgrade.
Lc_collate and lc_ctype have been per-database settings since server version
8.4, but pg_upgrade was still treating them as cluster-wide options. It
fetched the values for the template0 databases in old and new cluster, and
compared them. That's backwards; the encoding and locale of the template0
database doesn't matter, as template0 is guaranteed to contain only ASCII
characters. But if there are any other databases that exist on both clusters
(in particular template1 and postgres databases), their encodings and
locales must be compatible.
Also, make the locale comparison more lenient. If the locale names are not
equal, try to canonicalize both of them by passing them to setlocale(). We
used to do that only when upgrading from 9.1 or below, but it seems like a
good idea even with newer versions. If we change the canonical form of a
locale, this allows pg_upgrade to still work. I'm about to do just that to
fix bug #11431, by mapping a locale name that contains non-ASCII characters
to a pure-ASCII alias of the same locale.
No backpatching, because earlier versions of pg_upgrade still support
upgrading from 8.3 servers. That would be more complicated, so it doesn't
seem worth it, given that we haven't received any complaints about this
from users.
Robert Haas [Wed, 8 Oct 2014 18:35:43 +0000 (14:35 -0400)]
Extend shm_mq API with new functions shm_mq_sendv, shm_mq_set_handle.
shm_mq_sendv sends a message to the queue assembled from multiple
locations. This is expected to be used by forthcoming patches to
allow frontend/backend protocol messages to be sent via shm_mq, but
might be useful for other purposes as well.
shm_mq_set_handle associates a BackgroundWorkerHandle with an
already-existing shm_mq_handle. This solves a timing problem when
creating a shm_mq to communicate with a newly-launched background
worker: if you attach to the queue first, and the background worker
fails to start, you might block forever trying to do I/O on the queue;
but if you start the background worker first, but then die before
attaching to the queue, the background worrker might block forever
trying to do I/O on the queue. This lets you attach before starting
the worker (so that the worker is protected) and then associate the
BackgroundWorkerHandle later (so that you are also protected).
Alvaro Herrera [Tue, 7 Oct 2014 20:23:34 +0000 (17:23 -0300)]
Implement SKIP LOCKED for row-level locks
This clause changes the behavior of SELECT locking clauses in the
presence of locked rows: instead of causing a process to block waiting
for the locks held by other processes (or raise an error, with NOWAIT),
SKIP LOCKED makes the new reader skip over such rows. While this is not
appropriate behavior for general purposes, there are some cases in which
it is useful, such as queue-like tables.
Catalog version bumped because this patch changes the representation of
stored rules.
Reviewed by Craig Ringer (based on a previous attempt at an
implementation by Simon Riggs, who also provided input on the syntax
used in the current patch), David Rowley, and Álvaro Herrera.
Tom Lane [Tue, 7 Oct 2014 01:23:20 +0000 (21:23 -0400)]
Fix array overrun in ecpg's version of ParseDateTime().
The code wrote a value into the caller's field[] array before checking
to see if there was room, which of course is backwards. Per report from
Michael Paquier.
I fixed the equivalent bug in the backend's version of this code way back
in 630684d3a130bb93, but failed to think about ecpg's copy. Fortunately
this doesn't look like it would be exploitable for anything worse than a
core dump: an external attacker would have no control over the single word
that gets written.
Stephen Frost [Mon, 6 Oct 2014 15:18:13 +0000 (11:18 -0400)]
Clean up Create/DropReplicationSlot query buffer
CreateReplicationSlot() and DropReplicationSlot() were not cleaning up
the query buffer in some cases (mostly error conditions) which meant a
small leak. Not generally an issue as the error case would result in an
immediate exit, but not difficult to fix either and reduces the number
of false positives from code analyzers.
In passing, also add appropriate PQclear() calls to RunIdentifySystem().
Andres Freund [Mon, 6 Oct 2014 10:51:37 +0000 (12:51 +0200)]
Add support for managing physical replication slots to pg_receivexlog.
pg_receivexlog already has the capability to use a replication slot to
reserve WAL on the upstream node. But the used slot currently has to
be created via SQL.
To allow using slots directly, without involving SQL, add
--create-slot and --drop-slot actions, analogous to the logical slot
manipulation support in pg_recvlogical.
Author: Michael Paquier
Discussion: CABUevEx+zrOHZOQg+dPapNPFRJdsk59b=TSVf30Z71GnFXhQaw@mail.gmail.com
Andres Freund [Mon, 6 Oct 2014 10:11:52 +0000 (12:11 +0200)]
Rename pg_recvlogical's --create/--drop to --create-slot/--drop-slot.
A future patch (9.5 only) adds slot management to pg_receivexlog. The
verbs create/drop don't seem descriptive enough there. It seems better
to rename pg_recvlogical's commands now, in beta, than live with the
inconsistency forever.
The old form (e.g. --drop) will still be accepted by virtue of most
getopt_long() options accepting abbreviations for long commands.
Backpatch to 9.4 where pg_recvlogical was introduced.
Author: Michael Paquier and Andres Freund
Discussion: CAB7nPqQtt79U6FmhwvgqJmNyWcVCbbV-nS72j_jyPEopERg9rg@mail.gmail.com
Tom Lane [Sun, 5 Oct 2014 18:14:04 +0000 (14:14 -0400)]
Update 9.4 release notes for commits through today.
Add entries for recent changes, including noting the JSONB format change
and the recent timezone data changes. We should remove those two items
before 9.4 final: the JSONB change will be of no interest in the long
run, and it's not normally our habit to mention timezone updates in
major-release notes. But it seems important to document them temporarily
for beta testers.
I failed to resist the temptation to wordsmith a couple of existing
entries, too.
Robert Haas [Sun, 5 Oct 2014 01:25:41 +0000 (21:25 -0400)]
Eliminate one background-worker-related flag variable.
Teach sigusr1_handler() to use the same test for whether a worker
might need to be started as ServerLoop(). Aside from being perhaps
a bit simpler, this prevents a potentially-unbounded delay when
starting a background worker. On some platforms, select() doesn't
return when interrupted by a signal, but is instead restarted,
including a reset of the timeout to the originally-requested value.
If signals arrive often enough, but no connection requests arrive,
sigusr1_handler() will be executed repeatedly, but the body of
ServerLoop() won't be reached. This change ensures that, even in
that case, background workers will eventually get launched.
This is far from a perfect fix; really, we need select() to return
control to ServerLoop() after an interrupt, either via the self-pipe
trick or some other mechanism. But that's going to require more
work and discussion, so let's do this for now to at least mitigate
the damage.
Per investigation of test_shm_mq failures on buildfarm member anole.
Tom Lane [Sat, 4 Oct 2014 18:18:19 +0000 (14:18 -0400)]
Update time zone data files to tzdata release 2014h.
Most zones in the Russian Federation are subtracting one or two hours
as of 2014-10-26. Update the meanings of the abbreviations IRKT, KRAT,
MAGT, MSK, NOVT, OMST, SAKT, VLAT, YAKT, YEKT to match.
The IANA timezone database has adopted abbreviations of the form AxST/AxDT
for all Australian time zones, reflecting what they believe to be current
majority practice Down Under. These names do not conflict with usage
elsewhere (other than ACST for Acre Summer Time, which has been in disuse
since 1994). Accordingly, adopt these names into our "Default" timezone
abbreviation set. The "Australia" abbreviation set now contains only
CST,EAST,EST,SAST,SAT,WST, all of which are thought to be mostly historical
usage. Note that SAST has also been changed to be South Africa Standard
Time in the "Default" abbreviation set.
Add zone abbreviations SRET (Asia/Srednekolymsk) and XJT (Asia/Urumqi),
and use WSST/WSDT for western Samoa.
Also a DST law change in the Turks & Caicos Islands (America/Grand_Turk),
and numerous corrections for historical time zone data.
Tom Lane [Fri, 3 Oct 2014 21:44:38 +0000 (17:44 -0400)]
Update time zone abbreviations lists.
This updates known_abbrevs.txt to be what it should have been already,
were my -P patch not broken; and updates some tznames/ entries that
missed getting any love in previous timezone data updates because zic
failed to flag the change of abbreviation.
The non-cosmetic updates:
* Remove references to "ADT" as "Arabia Daylight Time", an abbreviation
that's been out of use since 2007; therefore, claiming there is a conflict
with "Atlantic Daylight Time" doesn't seem especially helpful. (We have
left obsolete entries in the files when they didn't conflict with anything,
but that seems like a different situation.)
* Fix entirely incorrect GMT offsets for CKT (Cook Islands), FJT, FJST
(Fiji); we didn't even have them on the proper side of the date line.
(Seems to have been aboriginal errors in our tznames data; there's no
evidence anything actually changed recently.)
* FKST (Falkland Islands Summer Time) is now used all year round, so
don't mark it as a DST abbreviation.
* Update SAKT (Sakhalin) to mean GMT+11 not GMT+10.
In cosmetic changes, I fixed a bunch of wrong (or at least obsolete)
claims about abbreviations not being present in the zic files, and
tried to be consistent about how obsolete abbreviations are labeled.
Note the underlying timezone/data files are still at release 2014e;
this is just trying to get us in sync with what those files actually
say before we go to the next update.
Stephen Frost [Fri, 3 Oct 2014 20:31:53 +0000 (16:31 -0400)]
Fix CreatePolicy, pg_dump -v; psql and doc updates
Peter G pointed out that valgrind was, rightfully, complaining about
CreatePolicy() ending up copying beyond the end of the parsed policy
name. Name is a fixed-size type and we need to use namein (through
DirectFunctionCall1()) to flush out the entire array before we pass
it down to heap_form_tuple.
Michael Paquier pointed out that pg_dump --verbose was missing a
newline and Fabrízio de Royes Mello further pointed out that the
schema was also missing from the messages, so fix those also.
Also, based on an off-list comment from Kevin, rework the psql \d
output to facilitate copy/pasting into a new CREATE or ALTER POLICY
command.
Lastly, improve the pg_policies view and update the documentation for
it, along with a few other minor doc corrections based on an off-list
discussion with Adam Brightwell.
Tom Lane [Fri, 3 Oct 2014 18:48:11 +0000 (14:48 -0400)]
Fix bogus logic for zic -P option.
The quick hack I added to zic to dump out currently-in-use timezone
abbreviations turns out to have a nasty bug: within each zone, it was
printing the last "struct ttinfo" to be *defined*, not necessarily the
last one in use. This was mainly a problem in zones that had changed the
meaning of their zone abbreviation (to another GMT offset value) and later
changed it back.
As a result of this error, we'd missed out updating the tznames/ files
for some jurisdictions that have changed their zone abbreviations since
the tznames/ files were originally created. I'll address the missing data
updates in a separate commit.
Alvaro Herrera [Fri, 3 Oct 2014 16:01:27 +0000 (13:01 -0300)]
Don't balance vacuum cost delay when per-table settings are in effect
When there are cost-delay-related storage options set for a table,
trying to make that table participate in the autovacuum cost-limit
balancing algorithm produces undesirable results: instead of using the
configured values, the global values are always used,
as illustrated by Mark Kirkwood in
http://www.postgresql.org/message-id/52FACF15.8020507@catalyst.net.nz
Since the mechanism is already complicated, just disable it for those
cases rather than trying to make it cope. There are undesirable
side-effects from this too, namely that the total I/O impact on the
system will be higher whenever such tables are vacuumed. However, this
is seen as less harmful than slowing down vacuum, because that would
cause bloat to accumulate. Anyway, in the new system it is possible to
tweak options to get the precise behavior one wants, whereas with the
previous system one was simply hosed.
This has been broken forever, so backpatch to all supported branches.
This might affect systems where cost_limit and cost_delay have been set
for individual tables.
Check for GiST index tuples that don't fit on a page.
The page splitting code would go into infinite recursion if you try to
insert an index tuple that doesn't fit even on an empty page.
Per analysis and suggested fix by Andrew Gierth. Fixes bug #11555, reported
by Bryan Seitz (analysis happened over IRC). Backpatch to all supported
versions.
Robert Haas [Thu, 2 Oct 2014 17:58:50 +0000 (13:58 -0400)]
Increase the number of buffer mapping partitions to 128.
Testing by Amit Kapila, Andres Freund, and myself, with and without
other patches that also aim to improve scalability, seems to indicate
that this change is a significant win over the current value and over
smaller values such as 64. It's not clear how high we can push this
value before it starts to have negative side-effects elsewhere, but
going this far looks OK.
Tom Lane [Wed, 1 Oct 2014 23:30:24 +0000 (19:30 -0400)]
Fix some more problems with nested append relations.
As of commit a87c72915 (which later got backpatched as far as 9.1),
we're explicitly supporting the notion that append relations can be
nested; this can occur when UNION ALL constructs are nested, or when
a UNION ALL contains a table with inheritance children.
Bug #11457 from Nelson Page, as well as an earlier report from Elvis
Pranskevichus, showed that there were still nasty bugs associated with such
cases: in particular the EquivalenceClass mechanism could try to generate
"join" clauses connecting an appendrel child to some grandparent appendrel,
which would result in assertion failures or bogus plans.
Upon investigation I concluded that all current callers of
find_childrel_appendrelinfo() need to be fixed to explicitly consider
multiple levels of parent appendrels. The most complex fix was in
processing of "broken" EquivalenceClasses, which are ECs for which we have
been unable to generate all the derived equality clauses we would like to
because of missing cross-type equality operators in the underlying btree
operator family. That code path is more or less entirely untested by
the regression tests to date, because no standard opfamilies have such
holes in them. So I wrote a new regression test script to try to exercise
it a bit, which turned out to be quite a worthwhile activity as it exposed
existing bugs in all supported branches.
The present patch is essentially the same as far back as 9.2, which is
where parameterized paths were introduced. In 9.0 and 9.1, we only need
to back-patch a small fragment of commit 5b7b5518d, which fixes failure to
propagate out the original WHERE clauses when a broken EC contains constant
members. (The regression test case results show that these older branches
are noticeably stupider than 9.2+ in terms of the quality of the plans
generated; but we don't really care about plan quality in such cases,
only that the plan not be outright wrong. A more invasive fix in the
older branches would not be a good idea anyway from a plan-stability
standpoint.)
Andres Freund [Wed, 1 Oct 2014 15:22:21 +0000 (17:22 +0200)]
Refactor replication connection code of various pg_basebackup utilities.
Move some more code to manage replication connection command to
streamutil.c. A later patch will introduce replication slot via
pg_receivexlog and this avoid duplicating relevant code between
pg_receivexlog and pg_recvlogical.
Andres Freund [Mon, 29 Sep 2014 13:35:40 +0000 (15:35 +0200)]
pg_recvlogical.c code review.
Several comments still referred to 'initiating', 'freeing', 'stopping'
replication slots. These were terms used during different phases of
the development of logical decoding, but are no long accurate.
Also rename StreamLog() to StreamLogicalLog() and add 'void' to the
prototype.
Author: Michael Paquier, with some editing by me.
Backpatch to 9.4 where pg_recvlogical was introduced.
Remove num_xloginsert_locks GUC, replace with a #define
I left the GUC in place for the beta period, so that people could experiment
with different values. No-one's come up with any data that a different value
would be better under some circumstances, so rather than try to document to
users what the GUC, let's just hard-code the current value, 8.
Andres Freund [Wed, 1 Oct 2014 12:23:43 +0000 (14:23 +0200)]
Block signals while computing the sleep time in postmaster's main loop.
DetermineSleepTime() was previously called without blocked
signals. That's not good, because it allows signal handlers to
interrupt its workings.
DetermineSleepTime() was added in 9.3 with the addition of background
workers (da07a1e856511), where it only read from
BackgroundWorkerList.
Since 9.4, where dynamic background workers were added (7f7485a0cde),
the list is also manipulated in DetermineSleepTime(). That's bad
because the list now can be persistently corrupted if modified by both
a signal handler and DetermineSleepTime().
This was discovered during the investigation of hangs on buildfarm
member anole. It's unclear whether this bug is the source of these
hangs or not, but it's worth fixing either way. I have confirmed that
it can cause crashes.
It luckily looks like this only can cause problems when bgworkers are
actively used.
Add functions for dealing with PGP armor header lines to pgcrypto.
This add a new pgp_armor_headers function to extract armor headers from an
ASCII-armored blob, and a new overloaded variant of the armor function, for
constructing an ASCII-armor with extra headers.
Andres Freund [Wed, 1 Oct 2014 09:54:05 +0000 (11:54 +0200)]
Rename CACHE_LINE_SIZE to PG_CACHE_LINE_SIZE.
As noted in http://bugs.debian.org/763098 there is a conflict between
postgres' definition of CACHE_LINE_SIZE and the definition by various
*bsd platforms. It's debatable who has the right to define such a
name, but postgres' use was only introduced in 375d8526f290 (9.4), so
it seems like a good idea to rename it.
Stephen Frost [Tue, 30 Sep 2014 19:55:28 +0000 (15:55 -0400)]
Correct stdin/stdout usage in COPY .. PROGRAM
The COPY documentation incorrectly stated, for the PROGRAM case,
that we read from stdin and wrote to stdout. Fix that, and improve
consistency by referring to the 'PostgreSQL' user instead of the
'postgres' user, as is done in the rest of the COPY documentation.
Pointed out by Peter van Dijk.
Back-patch to 9.3 where COPY .. PROGRAM was introduced.