Robert Haas [Tue, 9 Sep 2014 21:45:20 +0000 (17:45 -0400)]
Change the spinlock primitives to function as compiler barriers.
Previously, they functioned as barriers against CPU reordering but not
compiler reordering, an odd API that required extensive use of volatile
everywhere that spinlocks are used. That's error-prone and has negative
implications for performance, so change it.
In theory, this makes it safe to remove many of the uses of volatile
that we currently have in our code base, but we may find that there are
some bugs in this effort when we do. In the long run, though, this
should make for much more maintainable code.
Tom Lane [Tue, 9 Sep 2014 19:34:10 +0000 (15:34 -0400)]
Add width_bucket(anyelement, anyarray).
This provides a convenient method of classifying input values into buckets
that are not necessarily equal-width. It works on any sortable data type.
The choice of function name is a bit debatable, perhaps, but showing that
there's a relationship to the SQL standard's width_bucket() function seems
more attractive than the other proposals.
The xml type previously rejected "content" that is empty or consists
only of spaces. But the SQL/XML standard allows that, so change that.
The accepted values for XML "documents" are not changed.
Stephen Frost [Tue, 9 Sep 2014 14:52:10 +0000 (10:52 -0400)]
Move ALTER ... ALL IN to ProcessUtilitySlow
Now that ALTER TABLE .. ALL IN TABLESPACE has replaced the previous
ALTER TABLESPACE approach, it makes sense to move the calls down in
to ProcessUtilitySlow where the rest of ALTER TABLE is handled.
This also means that event triggers will support ALTER TABLE .. ALL
(which was the impetus for the original change, though it has other
good qualities also).
Andres Freund [Mon, 8 Sep 2014 22:47:32 +0000 (00:47 +0200)]
Fix spinlock implementation for some !solaris sparc platforms.
Some Sparc CPUs can be run in various coherence models, ranging from
RMO (relaxed) over PSO (partial) to TSO (total). Solaris has always
run CPUs in TSO mode while in userland, but linux didn't use to and
the various *BSDs still don't. Unfortunately the sparc TAS/S_UNLOCK
were only correct under TSO. Fix that by adding the necessary memory
barrier instructions. On sparcv8+, which should be all relevant CPUs,
these are treated as NOPs if the current consistency model doesn't
require the barriers.
Will be backpatched to all released branches once a few buildfarm
cycles haven't shown up problems. As I've no access to sparc, this is
blindly written.
Tom Lane [Mon, 8 Sep 2014 20:09:45 +0000 (16:09 -0400)]
Fix psql \s to work with recent libedit, and add pager support.
psql's \s (print command history) doesn't work at all with recent libedit
versions when printing to the terminal, because libedit tries to do an
fchmod() on the target file which will fail if the target is /dev/tty.
(We'd already noted this in the context of the target being /dev/null.)
Even before that, it didn't work pleasantly, because libedit likes to
encode the command history file (to ensure successful reloading), which
renders it nigh unreadable, not to mention significantly different-looking
depending on exactly which libedit version you have. So let's forget using
write_history() for this purpose, and instead print the data ourselves,
using logic similar to that used to iterate over the history for newline
encoding/decoding purposes.
While we're at it, insert the ability to use the pager when \s is printing
to the terminal. This has been an acknowledged shortcoming of \s for many
years, so while you could argue it's not exactly a back-patchable bug fix
it still seems like a good improvement. Anyone who's seriously annoyed
at this can use "\s /dev/tty" or local equivalent to get the old behavior.
Experimentation with this showed that the history iteration logic was
actually rather broken when used with libedit. It turns out that with
libedit you have to use previous_history() not next_history() to advance
to more recent history entries. The easiest and most robust fix for this
seems to be to make a run-time test to verify which function to call.
We had not noticed this because libedit doesn't really need the newline
encoding logic: its own encoding ensures that command entries containing
newlines are reloaded correctly (unlike libreadline). So the effective
behavior with recent libedits was that only the oldest history entry got
newline-encoded or newline-decoded. However, because of yet other bugs in
history_set_pos(), some old versions of libedit allowed the existing loop
logic to reach entries besides the oldest, which means there may be libedit
~/.psql_history files out there containing encoded newlines in more than
just the oldest entry. To ensure we can reload such files, it seems
appropriate to back-patch this fix, even though that will result in some
incompatibility with older psql versions (ie, multiline history entries
written by a psql with this fix will look corrupted to a psql without it,
if its libedit is reasonably up to date).
Tom Lane [Mon, 8 Sep 2014 02:40:41 +0000 (22:40 -0400)]
Documentation fix: sum(float4) returns float4, not float8.
The old claim is from my commit d06ebdb8d3425185d7e641d15e45908658a0177d of
2000-07-17, but it seems to have been a plain old thinko; sum(float4) has
been distinct from sum(float8) since Berkeley days. Noted by KaiGai Kohei.
While at it, mention the existence of sum(money), which is also of
embarrassingly ancient vintage.
The link to the NIST web page about DES standards leads to nowhere, and
according to archive.org has been forwarded to an unrelated page for
many years. Therefore, just remove that link. More up to date
information can be found via Wikipedia, for example.
Allow \watch to display query execution time if \timing is enabled.
Previously \watch could not display the query execution time even
when \timing was enabled because it used PSQLexec instead of
SendQuery and that function didn't support \timing. This patch
introduces PSQLexecWatch and changes \watch so as to use it, instead.
PSQLexecWatch is the function to run the query, print its results and
display how long it took (only when \timing is enabled).
This patch also changes --echo-hidden so that it doesn't print
the query that \watch executes. Since \watch cannot execute
backslash command queries, they should not be printed even
when --echo-hidden is set.
Patch by me, review by Heikki Linnakangas and Michael Paquier
Check number of parameters in RAISE statement at compile time.
The number of % parameter markers in RAISE statement should match the number
of parameters given. We used to check that at execution time, but we have
all the information needed at compile time, so let's check it at compile
time instead. It's generally better to find mistakes earlier.
Refactor per-page logic common to all redo routines to a new function.
Every redo routine uses the same idiom to determine what to do to a page:
check if there's a backup block for it, and if not read, the buffer if the
block exists, and check its LSN. Refactor that into a common function,
XLogReadBufferForRedo, making all the redo routines shorter and more
readable.
This has no user-visible effect, and makes no changes to the WAL format.
Reviewed by Andres Freund, Alvaro Herrera, Michael Paquier.
Andres Freund [Mon, 1 Sep 2014 11:42:43 +0000 (13:42 +0200)]
Add skip-empty-xacts option to test_decoding for use in the regression tests.
The regression tests for contrib/test_decoding regularly failed on
postgres instances that were very slow. Either because the hardware
itself was slow or because very expensive debugging options like
CLOBBER_CACHE_ALWAYS were used.
The reason they failed was just that some additional transactions were
decoded. Analyze and vacuum, triggered by autovac.
To fix just add a option to test_decoding to only display transactions
in which a change was actually displayed. That's not pretty because it
removes information from the tests; but better than constantly failing
tests in very likely harmless ways.
Backpatch to 9.4 where logical decoding was introduced.
Andres Freund [Sun, 31 Aug 2014 22:16:55 +0000 (00:16 +0200)]
Declare lwlock.c's LWLockAcquireCommon() as a static inline.
68a2e52bbaf98f136 has introduced LWLockAcquireCommon() containing the
previous contents of LWLockAcquire() plus added functionality. The
latter then calls it, just like LWLockAcquireWithVar(). Because the
majority of callers don't need the added functionality, declare the
common code as inline. The compiler then can optimize away the unused
code. Doing so is also useful when looking at profiles, to
differentiate the users.
Backpatch to 9.4, the first branch to contain LWLockAcquireCommon().
Andres Freund [Sat, 30 Aug 2014 12:03:21 +0000 (14:03 +0200)]
Make backend local tracking of buffer pins memory efficient.
Since the dawn of time (aka Postgres95) multiple pins of the same
buffer by one backend have been optimized not to modify the shared
refcount more than once. This optimization has always used a NBuffer
sized array in each backend keeping track of a backend's pins.
That array (PrivateRefCount) was one of the biggest per-backend memory
allocations, depending on the shared_buffers setting. Besides the
waste of memory it also has proven to be a performance bottleneck when
assertions are enabled as we make sure that there's no remaining pins
left at the end of transactions. Also, on servers with lots of memory
and a correspondingly high shared_buffers setting the amount of random
memory accesses can also lead to poor cpu cache efficiency.
Because of these reasons a backend's buffers pins are now kept track
of in a small statically sized array that overflows into a hash table
when necessary. Benchmarks have shown neutral to positive performance
results with considerably lower memory usage.
Fix bug in compressed GIN data leaf page splitting code.
The list of posting lists it's dealing with can contain placeholders for
deleted posting lists. The placeholders are kept around so that they can
be WAL-logged, but we must be careful to not try to access them.
This fixes bug #11280, reported by Mårten Svantesson. Backpatch to 9.4,
where the compressed data leaf page code was added.
Noah Misch [Fri, 29 Aug 2014 00:36:27 +0000 (20:36 -0400)]
Always use our getaddrinfo.c on Windows.
Commit a16bac36eca8158cbf78987e95376f610095f792 let "configure" detect
the system getaddrinfo() when building under 64-bit MinGW-w64. However,
src/include/port/win32/sys/socket.h assumes all native Windows
configurations use our replacement. This change placates buildfarm
member jacana until we establish a plan for getaddrinfo() on Windows.
Tom Lane [Thu, 28 Aug 2014 22:21:05 +0000 (18:21 -0400)]
Fix citext upgrade script for disallowance of oidvector element assignment.
In commit 45e02e3232ac7cc5ffe36f7986159b5e0b1f6fdc, we intentionally
disallowed updates on individual elements of oidvector columns. While that
still seems like a sane idea in the abstract, we (I) forgot that citext's
"upgrade from unpackaged" script did in fact perform exactly such updates,
in order to fix the problem that citext indexes should have a collation
but would not in databases dumped or upgraded from pre-9.1 installations.
Even if we wanted to add casts to allow such updates, there's no practical
way to do so in the back branches, so the only real alternative is to make
citext's kluge even klugier. In this patch, I cast the oidvector to text,
fix its contents with regexp_replace, and cast back to oidvector. (Ugh!)
Since the aforementioned commit went into all active branches, we have to
fix this in all branches that contain the now-broken update script.
As the side effect of the reverted commit, when the unit is
specified, the reloption was stored in the catalog with the unit.
This broke pg_dump (specifically, it prevented pg_dump from
outputting restorable backup regarding the reloption) and
turned the buildfarm red. Revert the commit until the fixed
version is ready.
Andres Freund [Thu, 28 Aug 2014 11:59:29 +0000 (13:59 +0200)]
Allow escaping of option values for options passed at connection start.
This is useful to allow to set GUCs to values that include spaces;
something that wasn't previously possible. The primary case motivating
this is the desire to set default_transaction_isolation to 'repeatable
read' on a per connection basis, but other usecases like seach_path do
also exist.
This introduces a slight backward incompatibility: Previously a \ in
an option value would have been passed on literally, now it'll be
taken as an escape.
The relevant mailing list discussion starts with 20140204125823.GJ12016@awork2.anarazel.de.
Fujii Masao [Thu, 28 Aug 2014 06:55:50 +0000 (15:55 +0900)]
Allow units to be specified in relation option setting value.
This introduces an infrastructure which allows us to specify the units
like ms (milliseconds) in integer relation option, like GUC parameter.
Currently only autovacuum_vacuum_cost_delay reloption can accept
the units.
Jeff Davis [Thu, 28 Aug 2014 04:07:36 +0000 (21:07 -0700)]
Allow multibyte characters as escape in SIMILAR TO and SUBSTRING.
Previously, only a single-byte character was allowed as an
escape. This patch allows it to be a multi-byte character, though it
still must be a single character.
Alvaro Herrera [Wed, 27 Aug 2014 23:15:18 +0000 (19:15 -0400)]
Fix FOR UPDATE NOWAIT on updated tuple chains
If SELECT FOR UPDATE NOWAIT tries to lock a tuple that is concurrently
being updated, it might fail to honor its NOWAIT specification and block
instead of raising an error.
Fix by adding a no-wait flag to EvalPlanQualFetch which it can pass down
to heap_lock_tuple; also use it in EvalPlanQualFetch itself to avoid
blocking while waiting for a concurrent transaction.
Authors: Craig Ringer and Thomas Munro, tweaked by Álvaro
http://www.postgresql.org/message-id/51FB6703.9090801@2ndquadrant.com
Per Thomas Munro in the course of his SKIP LOCKED feature submission,
who also provided one of the isolation test specs.
Backpatch to 9.4, because that's as far back as it applies without
conflicts (although the bug goes all the way back). To that branch also
backpatch Thomas Munro's new NOWAIT test cases, committed in master by
Heikki as commit 9ee16b49f0aac819bd4823d9b94485ef608b34e8 .
Stephen Frost [Wed, 27 Aug 2014 03:08:41 +0000 (23:08 -0400)]
Fix Var handling for security barrier views
In some cases, not all Vars were being correctly marked as having been
modified for updatable security barrier views, which resulted in invalid
plans (eg: when security barrier views were created over top of
inheiritance structures).
In passing, be sure to update both varattno and varonattno, as _equalVar
won't consider the Vars identical otherwise. This isn't known to cause
any issues with updatable security barrier views, but was noticed as
missing while working on RLS and makes sense to get fixed.
Back-patch to 9.4 where updatable security barrier views were
introduced.
Kevin Grittner [Tue, 26 Aug 2014 14:56:26 +0000 (09:56 -0500)]
Fix superuser concurrent refresh of matview owned by another.
Use SECURITY_LOCAL_USERID_CHANGE while building temporary tables;
only escalate to SECURITY_RESTRICTED_OPERATION while potentially
running user-supplied code. The more secure mode was preventing
temp table creation. Add regression tests to cover this problem.
This fixes Bug #11208 reported by Bruno Emanuel de Andrade Silva.
Andres Freund [Tue, 26 Aug 2014 10:21:06 +0000 (12:21 +0200)]
Specify the port in dblink and postgres_fdw tests.
That allows to run those tests against a postmaster listening on a
nonstandard port without requiring to export PGPORT in postmaster's
environment.
This still doesn't support connecting to a nondefault host without
configuring it in postmaster's environment. That's harder and less
frequently used though. So this is a useful step.
Andres Freund [Tue, 26 Aug 2014 00:54:53 +0000 (02:54 +0200)]
Don't hardcode contrib_regression dbname in postgres_fdw and dblink tests.
That allows parallel installcheck to succeed inside contrib/. The
output is not particularly pretty unless make's -O option to
synchronize the output is used.
There's other tests, outside contrib, that use a hardcoded,
non-unique, database name. Those prohibit paralell installcheck to be
used across more directories; but that's something for a separate
patch.
Bruce Momjian [Tue, 26 Aug 2014 02:19:05 +0000 (22:19 -0400)]
pg_upgrade: prevent automatic oid assignment
Prevent automatic oid assignment when in binary upgrade mode. Also
throw an error when contrib/pg_upgrade_support functions are called when
not in binary upgrade mode.
This prevent automatically-assigned oids from conflicting with later
pre-assigned oids coming from the old cluster. It also makes sure oids
are preserved in call important cases.
Alvaro Herrera [Mon, 25 Aug 2014 19:33:17 +0000 (15:33 -0400)]
Revert XactLockTableWait context setup in conditional multixact wait
There's no point in setting up a context error callback when doing
conditional lock acquisition, because we never actually wait and so the
user wouldn't be able to see the context message anywhere. In fact,
this is more in line with what ConditionalXactLockTableWait is doing.
Alvaro Herrera [Mon, 25 Aug 2014 19:32:30 +0000 (15:32 -0400)]
Use newly added InvalidCommandId instead of 0
The symbol was added by 71901ab6d; the original code was introduced by 6868ed749. Development of both overlapped which is why we apparently
failed to notice.
This is a (very slight) behavior change, so I'm not backpatching this to
9.4 for now, even though the symbol does exist there.
Alvaro Herrera [Mon, 25 Aug 2014 19:32:26 +0000 (15:32 -0400)]
DefineType: return base type OID, not its array
Event triggers want to know the OID of the interesting object created,
which is the main type. The array created as part of the operation is
just a subsidiary object which is not of much interest.
Alvaro Herrera [Mon, 25 Aug 2014 19:32:18 +0000 (15:32 -0400)]
Have CREATE TABLE AS and REFRESH return an OID
Other DDL commands are already returning the OID, which is required for
future additional event trigger work. This is merely making these
commands in line with the rest of utility command support.
Alvaro Herrera [Mon, 25 Aug 2014 17:50:19 +0000 (13:50 -0400)]
Editorial review of SET UNLOGGED
Add a succint comment explaining why it's correct to change the
persistence in this way. Also s/loggedness/persistence/ because native
speakers didn't like the latter term.
Andres Freund [Mon, 25 Aug 2014 16:30:37 +0000 (18:30 +0200)]
Fix typos in some error messages thrown by extension scripts when fed to psql.
Some of the many error messages introduced in 458857cc missed 'FROM
unpackaged'. Also e016b724 and 45ffeb7e forgot to quote extension
version numbers.
Backpatch to 9.1, just like 458857cc which introduced the messages. Do
so because the error messages thrown when the wrong command is copy &
pasted aren't easy to understand.
Tom Lane [Sun, 24 Aug 2014 15:56:52 +0000 (11:56 -0400)]
Fix another ancient memory-leak bug in relcache.c.
CheckConstraintFetch() leaked a cstring in the caller's context for each
CHECK constraint expression it copied into the relcache. Ordinarily that
isn't problematic, but it can be during CLOBBER_CACHE testing because so
many reloads can happen during a single query; so complicate the code
slightly to allow freeing the cstring after use. Per testing on buildfarm
member barnacle.
This is exactly like the leak fixed in AttrDefaultFetch() by commit 078b2ed291c758e7125d72c3a235f128d40a232b. (Yes, this time I did look for
other instances of the same coding pattern :-(.) Like that patch, no
back-patch, since it seems unlikely that there's any problem except under
very artificial test conditions.
BTW, it strikes me that both of these places would require further work
comparable to commit ab8c84db2f7af008151b848cf1d6a4672a39eecd, if we ever
supported defaults or check constraints on system catalogs: they both
assume they are copying into an empty relcache data structure, and that
conceivably wouldn't be the case during recursive reloading of a system
catalog. This does not seem worth worrying about for the moment, since
there is no near-term prospect of supporting any such thing. So I'll
just note the possibility for the archives' sake.
Peter Eisentraut [Sat, 23 Aug 2014 04:23:34 +0000 (00:23 -0400)]
doc: Improve pg_restore help output
Add a note that some options can be specified multiple times to select
multiple objects to restore. This replaces the somewhat confusing use
of plurals in the option descriptions themselves.
Tom Lane [Fri, 22 Aug 2014 17:17:58 +0000 (13:17 -0400)]
Fix corner-case behaviors in JSON/JSONB field extraction operators.
Cause the path extraction operators to return their lefthand input,
not NULL, if the path array has no elements. This seems more consistent
since the case ought to correspond to applying the simple extraction
operator (->) zero times.
Cause other corner cases in field/element/path extraction to return NULL
rather than failing. This behavior is arguably more useful than throwing
an error, since it allows an expression index using these operators to be
built even when not all values in the column are suitable for the
extraction being indexed. Moreover, we already had multiple
inconsistencies between the path extraction operators and the simple
extraction operators, as well as inconsistencies between the JSON and
JSONB code paths. Adopt a uniform rule of returning NULL rather than
throwing an error when the JSON input does not have a structure that
permits the request to be satisfied.
Back-patch to 9.4. Update the release notes to list this as a behavior
change since 9.3.
Change the way pg_basebackup's tablespace mapping is implemented.
Previously, we would first create the symlinks the way they are in the
original system, and at the end replace them with the mapped symlinks.
That never really made much sense, so now we create the symlink pointing
to the correct location to begin with, so that there's no need to fix
them at the end.
The old coding didn't work correctly on Windows, because Windows junction
points look more like directories than files, and ought to be removed with
rmdir rather than unlink. Also, it incorrectly used "%d" rather than "%u"
to print an Oid, but that's gone now.
Report and patch by Amit Kapila, with minor changes by me. Reviewed by
MauMau. Backpatch to 9.4, where the --tablespace feature was added.
Stephen Frost [Thu, 21 Aug 2014 23:06:17 +0000 (19:06 -0400)]
Rework 'MOVE ALL' to 'ALTER .. ALL IN TABLESPACE'
As 'ALTER TABLESPACE .. MOVE ALL' really didn't change the tablespace
but instead changed objects inside tablespaces, it made sense to
rework the syntax and supporting functions to operate under the
'ALTER (TABLE|INDEX|MATERIALIZED VIEW)' syntax and to be in
tablecmds.c.
Pointed out by Alvaro, who also suggested the new syntax.
Andres Freund [Thu, 21 Aug 2014 22:28:37 +0000 (00:28 +0200)]
Add pinning_backends column to the pg_buffercache extension.
The new column shows how many backends have a buffer pinned. That can
be useful during development or to diagnose production issues
e.g. caused by vacuum waiting for cleanup locks.
To handle upgrades transparently - the extension might be used in
views - deal with callers expecting the old number of columns.
Tom Lane [Wed, 20 Aug 2014 23:05:05 +0000 (19:05 -0400)]
More regression test cases for json/jsonb extraction operators.
Cover some cases I omitted before, such as null and empty-string
elements in the path array. This exposes another inconsistency:
json_extract_path complains about empty path elements but
jsonb_extract_path does not.
Tom Lane [Wed, 20 Aug 2014 20:48:35 +0000 (16:48 -0400)]
Fix core dump in jsonb #> operator, and add regression test cases.
jsonb's #> operator segfaulted (dereferencing a null pointer) if the RHS
was a zero-length array, as reported in bug #11207 from Justin Van Winkle.
json's #> operator returns NULL in such cases, so for the moment let's
make jsonb act likewise.
Also add a bunch of regression test queries memorializing the -> and #>
operators' behavior for this and other corner cases.
There is a good argument for changing some of these behaviors, as they
are not very consistent with each other, and throwing an error isn't
necessarily a desirable behavior for operators that are likely to be
used in indexes. However, everybody can agree that a core dump is the
Wrong Thing, and we need test cases even if we decide to change their
expected output later.
Use comma+space as the separator in the default search_path.
While the space is optional, it seems nicer to be consistent with what
you get if you do "SET search_path=...". SET always normalizes the
separator to be comma+space.