Tom Lane [Fri, 17 Jul 2015 19:53:09 +0000 (15:53 -0400)]
Repair mishandling of cached cast-expression trees in plpgsql.
In commit 1345cc67bbb014209714af32b5681b1e11eaf964, I introduced caching
of expressions representing type-cast operations into plpgsql. However,
I supposed that I could cache both the expression trees and the evaluation
state trees derived from them for the life of the session. This doesn't
work, because we execute the expressions in plpgsql's simple_eval_estate,
which has an ecxt_per_query_memory that is only transaction-lifespan.
Therefore we can end up putting pointers into the evaluation state tree
that point to transaction-lifespan memory; in particular this happens if
the cast expression calls a SQL-language function, as reported by Geoff
Winkless.
The minimum-risk fix seems to be to treat the state trees the same way
we do for "simple expression" trees in plpgsql, ie create them in the
simple_eval_estate's ecxt_per_query_memory, which means recreating them
once per transaction.
Since I had to introduce bookkeeping overhead for that anyway, I bought
back some of the added cost by sharing the read-only expression trees
across all functions in the session, instead of using a per-function
table as originally. The simple-expression bookkeeping takes care of
the recursive-usage risk that I was concerned about avoiding before.
At some point we should take a harder look at how all this works,
and see if we can't reduce the amount of tree reinitialization needed.
But that won't happen for 9.5.
Tom Lane [Fri, 17 Jul 2015 18:10:52 +0000 (14:10 -0400)]
Fix entirely broken permissions test in new alter_operator regression test.
Not only did this test fail to test what it was supposed to test, but it
left a user definition lying around, which caused subsequent runs of the
regression tests to fail.
xlc provides "long long" unconditionally at C99-compatible language
levels, and this option provokes a warning. The warning interferes with
"configure" tests that fail in response to any warning. Notably, before
commit 85a2a8903f7e9151793308d0638621003aded5ae, it interfered with the
test for -qnoansialias. Back-patch to 9.0 (all supported versions).
Tom Lane [Fri, 17 Jul 2015 02:57:46 +0000 (22:57 -0400)]
Fix a low-probability crash in our qsort implementation.
It's standard for quicksort implementations, after having partitioned the
input into two subgroups, to recurse to process the smaller partition and
then handle the larger partition by iterating. This method guarantees
that no more than log2(N) levels of recursion can be needed. However,
Bentley and McIlroy argued that checking to see which partition is smaller
isn't worth the cycles, and so their code doesn't do that but just always
recurses on the left partition. In most cases that's fine; but with
worst-case input we might need O(N) levels of recursion, and that means
that qsort could be driven to stack overflow. Such an overflow seems to
be the only explanation for today's report from Yiqing Jin of a SIGSEGV
in med3_tuple while creating an index of a couple billion entries with a
very large maintenance_work_mem setting. Therefore, let's spend the few
additional cycles and lines of code needed to choose the smaller partition
for recursion.
Also, fix up the qsort code so that it properly uses size_t not int for
some intermediate values representing numbers of items. This would only
be a live risk when sorting more than INT_MAX bytes (in qsort/qsort_arg)
or tuples (in qsort_tuple), which I believe would never happen with any
caller in the current core code --- but perhaps it could happen with
call sites in third-party modules? In any case, this is trouble waiting
to happen, and the corrected code is probably if anything shorter and
faster than before, since it removes sign-extension steps that had to
happen when converting between int and size_t.
In passing, move a couple of CHECK_FOR_INTERRUPTS() calls so that it's
not necessary to preserve the value of "r" across them, and prettify
the output of gen_qsort_tuple.pl a little.
Back-patch to all supported branches. The odds of hitting this issue
are probably higher in 9.4 and up than before, due to the new ability
to allocate sort workspaces exceeding 1GB, but there's no good reason
to believe that it's impossible to crash older branches this way.
AIX: Link TRANSFORM modules with their dependencies.
The result closely resembles linking of these modules for the "win32"
port. Augment the $(exports_file) header so the file is also usable as
an import file. Unfortunately, relocating an AIX installation will now
require adding $(pkglibdir) to LD_LIBRARY_PATH. Back-patch to 9.5,
where the modules were introduced.
AIX: Link the postgres executable with -Wl,-brtllib.
This allows PostgreSQL modules and their dependencies to have undefined
symbols, resolved at runtime. Perl module shared objects rely on that
in Perl 5.8.0 and later. This fixes the crash when PL/PerlU loads such
modules, as the hstore_plperl test suite does. Module authors can link
using -Wl,-G to permit undefined symbols; by default, linking will fail
as it has. Back-patch to 9.0 (all supported versions).
Add ALTER OPERATOR command, for changing selectivity estimator functions.
Other options cannot be changed, as it's not totally clear if cached plans
would need to be invalidated if one of the other options change. Selectivity
estimator functions only change plan costs, not correctness of plans, so
those should be safe.
Original patch by Uriy Zhuravlev, heavily edited by me.
In the test query I added for ALTER TABLE retaining comments, the order of
the result rows was not stable, and varied across systems. Add an ORDER BY
to make the order predictable. This should fix the buildfarm failures.
Retain comments on indexes and constraints at ALTER TABLE ... TYPE ...
When a column's datatype is changed, ATExecAlterColumnType() rebuilds all
the affected indexes and constraints, and the comments from the old
indexes/constraints were not carried over.
To fix, create a synthetic COMMENT ON command in the work queue, to re-add
any comments on constraints. For indexes, there's a comment field in
IndexStmt that is used.
This fixes bug #13126, reported by Kirill Simonov. Original patch by
Michael Paquier, reviewed by Petr Jelinek and me. This bug is present in
all versions, but only backpatch to 9.5. Given how minor the issue is, it
doesn't seem worth the work and risk to backpatch further than that.
The code in ATPostAlterTypeParse was very deeply indented, mostly because
there were two nested switch-case statements, which add a lot of
indentation. Use if-else blocks instead, to make the code less indented
and more readable.
This is in preparation for next patch that makes some actualy changes to
the function. These cosmetic parts have been separated to make it easier
to see the real changes in the other patch.
Tom Lane [Sun, 12 Jul 2015 20:25:51 +0000 (16:25 -0400)]
Fix assorted memory leaks.
Per Coverity (not that any of these are so non-obvious that they should not
have been caught before commit). The extent of leakage is probably minor
to unnoticeable, but a leak is a leak. Back-patch as necessary.
Andres Freund [Sun, 12 Jul 2015 20:06:27 +0000 (22:06 +0200)]
Optionally don't error out due to preexisting slots in commandline utilities.
pg_receivexlog and pg_recvlogical error out when --create-slot is
specified and a slot with the same name already exists. In some cases,
especially with pg_receivexlog, that's rather annoying and requires
additional scripting.
Backpatch to 9.5 as slot control functions have newly been added to
pg_receivexlog, and there doesn't seem much point leaving it in a less
useful state.
Joe Conway [Sat, 11 Jul 2015 21:19:31 +0000 (14:19 -0700)]
Add assign_expr_collations() to CreatePolicy() and AlterPolicy().
As noted by Noah Misch, CreatePolicy() and AlterPolicy() omit to call
assign_expr_collations() on the node trees. Fix the omission and add
his test case to the rowsecurity regression test.
Copy-edit the docs changes of OWNER TO CURRENT/SESSION_USER additions.
Commit 31eae602 added new syntax to many DDL commands to use CURRENT_USER
or SESSION_USER instead of role name in ALTER ... OWNER TO, but because
of a misplaced '{', the syntax in the docs implied that the syntax was
"ALTER ... CURRENT_USER", instead of "ALTER ... OWNER TO CURRENT_USER".
Fix that, and also the funny indentation in some of the modified syntax
blurps.
Tom Lane [Thu, 9 Jul 2015 22:50:31 +0000 (18:50 -0400)]
Improve documentation about array concat operator vs. underlying functions.
The documentation implied that there was seldom any reason to use the
array_append, array_prepend, and array_cat functions directly. But that's
not really true, because they can help make it clear which case is meant,
which the || operator can't do since it's overloaded to represent all three
cases. Add some discussion and examples illustrating the potentially
confusing behavior that can ensue if the parser misinterprets what was
meant.
Per a complaint from Michael Herold. Back-patch to 9.2, which is where ||
started to behave this way.
Tom Lane [Thu, 9 Jul 2015 17:22:22 +0000 (13:22 -0400)]
Fix postmaster's handling of a startup-process crash.
Ordinarily, a failure (unexpected exit status) of the startup subprocess
should be considered fatal, so the postmaster should just close up shop
and quit. However, if we sent the startup process a SIGQUIT or SIGKILL
signal, the failure is hardly "unexpected", and we should attempt restart;
this is necessary for recovery from ordinary backend crashes in hot-standby
scenarios. I attempted to implement the latter rule with a two-line patch
in commit 442231d7f71764b8c628044e7ce2225f9aa43b67, but it now emerges that
that patch was a few bricks shy of a load: it failed to distinguish the
case of a signaled startup process from the case where the new startup
process crashes before reaching database consistency. That resulted in
infinitely respawning a new startup process only to have it crash again.
To handle this properly, we really must track whether we have sent the
*current* startup process a kill signal. Rather than add yet another
ad-hoc boolean to the postmaster's state, I chose to unify this with the
existing RecoveryError flag into an enum tracking the startup process's
state. That seems more consistent with the postmaster's general state
machine design.
Make wal_compression PGC_SUSET rather than PGC_USERSET.
When enabling wal_compression, there is a risk to leak data similarly to
the BREACH and CRIME attacks on SSL where the compression ratio of
a full page image gives a hint of what is the existing data of this page.
This vulnerability is quite cumbersome to exploit in practice, but doable.
So this patch makes wal_compression PGC_SUSET in order to prevent
non-superusers from enabling it and exploiting the vulnerability while
DBA thinks the risk very seriously and disables it in postgresql.conf.
Back-patch to 9.5 where wal_compression was introduced.
Create a log file for each test run. Stdout and stderr of the test script,
as well as any subprocesses run as part of the test, are redirected to
the log file. This makes it a lot easier to debug test failures. Also print
the test output (ok 12 - ... messages) to the log file, and the command
line of any external programs executed with the system_or_bail and run_log
functions. This makes it a lot easier to debug failing tests.
Modify some of the pg_ctl and other command invocations to not use 'silent'
or 'quiet' options, and don't redirect output to /dev/null, so that you get
all the information in the log instead.
In the passing, construct some command lines in a way that works if $tempdir
contains quote-characters. I haven't systematically gone through all of
them or tested that, so I don't know if this is enough to make that work.
pg_rewind tests had a custom mechanism for creating a similar log file. Use
the new generic facility instead.
Use AS_IF rather than plain shell "if" in pthread-check.
Autoconf generates additional code for the first AC_CHECK_HEADERS call in
the script. If the first call is within an if-block, the additional code is
put inside the if-block too, even though it is needed by subsequent
AC_CHECK_HEADERS checks and should always be executed. When I moved the
pthread-related checks earlier in the script, the pthread.h test inside
the block became the very first AC_CHECK_HEADERS call in the script,
triggering that problem.
To fix, use AS_IF instead of plain shell if. AS_IF knows about that issue,
and makes sure the additional code is always executed. To be completely
safe from this issue (and others), we should always be using AS_IF instead
of plain if, but that seems like excessive caution given that this is the
first time we have trouble like this. Plain if-then is more readable than
AS_IF.
This should fix compilation with --disable-thread-safety, and hopefully the
buildfarm failure on forgmouth, related to mingw standard headers, too.
I backpatched the previous fixes to 9.5, but it's starting to look like
these changes are too fiddly to backpatch, so commit this to master only,
and revert all the pthread-related configure changes in 9.5.
The AIX 7.1 libm is static, and AIX postgres executables do not export
symbols acquired from libraries. Back-patch to 9.5, where commit cfe12763c32437bc708a64ce88a90c7544f16185 added a sqrt() call.
Revoke support for strxfrm() that write past the specified array length.
This formalizes a decision implicit in commit 4ea51cdfe85ceef8afabceb03c446574daa0ac23 and adds clean detection of
affected systems. Vendor updates are available for each such known bug.
Back-patch to 9.5, where the aforementioned commit first appeared.
POSIX does not specify the -q option, and many implementations do not
offer it. Don't bother changing the MSVC build system, because having
non-GNU diff on Windows is vanishingly unlikely. Back-patch to 9.2,
where this invocation was introduced.
Fix null pointer dereference in "\c" psql command.
The psql crash happened when no current connection existed. (The second
new check is optional given today's undocumented NULL argument handling
in PQhost() etc.) Back-patch to 9.0 (all supported versions).
Move pthread-tests earlier in the autoconf script.
On some Linux systems, "-lrt" exposed pthread-functions, so that linking
with -lrt was seemingly enough to make a program that uses pthreads to
work. However, when linking libpq, the dependency to libpthread was not
marked correctly, so that when an executable was linked with -lpq but
without -pthread, you got errors about undefined pthread_* functions from
libpq.
To fix, test for the flags required to use pthreads earlier in the autoconf
script, before checking any other libraries.
This should fix the failure on buildfarm member shearwater. gharial is also
failing; hopefully this fixes that too although the failure looks somewhat
different.
Replace our hacked version of ax_pthread.m4 with latest upstream version.
Our version was different from the upstream version in that we tried to use
all possible pthread-related flags that the compiler accepts, rather than
just the first one that works. That change was made in commit e48322a6d6cfce1ec52ab303441df329ddbc04d1, to work-around a bug affecting GCC
versions 3.2 and below (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=8888),
although we didn't realize that it was a GCC bug at the time. We hardly care
about that old GCC versions anymore, so we no longer need that workaround.
This fixes the macro for compilers that print warnings with the chosen
flags. That's pretty annoying on its own right, but it also inconspicuously
disabled thread-safety, because we refused to use any pthread-related flags
if the compiler produced warnings. Max Filippov reported that problem when
linking with uClibc and OpenSSL. The warnings-check was added because the
workaround for the GCC bug caused warnings otherwise, so it's no longer
needed either. We can just use the upstream version as is.
If you really want to compile with GCC version 3.2 or older, you can still
work-around it manually by setting PTHREAD_CFLAGS="-pthread -lpthread"
manually on the configure command line.
Backpatch to 9.5. I don't want to unnecessarily rock the boat on stable
branches, but 9.5 seems like fair game.
Joe Conway [Tue, 7 Jul 2015 21:35:35 +0000 (14:35 -0700)]
Improve regression test coverage of table lock modes vs permissions.
Test the interactions with permissions and LOCK TABLE. Specifically
ROW EXCLUSIVE, ACCESS SHARE, and ACCESS EXCLUSIVE modes against
SELECT, INSERT, UPDATE, DELETE, and TRUNCATE permissions. Discussed
by Stephen Frost and Michael Paquier, patch by the latter. Backpatch
to 9.5 where matching behavior was first committed.
Tom Lane [Tue, 7 Jul 2015 16:49:18 +0000 (12:49 -0400)]
Fix portability issue in pg_upgrade test script: avoid $PWD.
SUSv2-era shells don't set the PWD variable, though anything more modern
does. In the buildfarm environment this could lead to test.sh executing
with PWD pointing to $HOME or another high-level directory, so that there
were conflicts between concurrent executions of the test in different
branch subdirectories. This appears to be the explanation for recent
intermittent failures on buildfarm members binturong and dingo (and might
well have something to do with the buildfarm script's failure to capture
log files from pg_upgrade tests, too).
To fix, just use `pwd` in place of $PWD. AFAICS test.sh is the only place
in our source tree that depended on $PWD. Back-patch to all versions
containing this script.
Per buildfarm. Thanks to Oskari Saarenmaa for diagnosing the problem.
If an allocation fails in the main message handling loop, pqParseInput3
or pqParseInput2, it should not be treated as "not enough data available
yet". Otherwise libpq will wait indefinitely for more data to arrive from
the server, and gets stuck forever.
This isn't a complete fix - getParamDescriptions and getCopyStart still
have the same issue, but it's a step in the right direction.
Michael Paquier and me. Backpatch to all supported versions.
Andres Freund [Tue, 7 Jul 2015 11:05:41 +0000 (13:05 +0200)]
Fix logical decoding bug leading to inefficient reopening of files.
When spilling transaction data to disk a simple typo caused the output
file to be closed and reopened for every serialized change. That happens
to not have a huge impact on linux, which is why it probably wasn't
noticed so far, but on windows that appears to trigger actual disk
writes after every change. Not fun.
The bug fortunately does not have any impact besides speed. A change
could end up being in the wrong segment (last instead of next), but
since we read all files to the end, that's just ugly, not really
problematic. It's not a problem to upgrade, since transaction spill
files do not persist across restarts.
Andres Freund [Tue, 7 Jul 2015 10:47:44 +0000 (12:47 +0200)]
Fix pg_recvlogical not to fsync output when it's a tty or pipe.
The previous coding tried to handle possible failures when fsyncing a
tty or pipe fd by accepting EINVAL - but apparently some
platforms (windows, OSX) don't reliably return that. So instead check
whether the output fd refers to a pipe or a tty when opening it.
Reported-By: Olivier Gosseaume, Marko Tiikkaja
Discussion: 559AF98B.3050901@joh.to
Turn install.bat into a pure one line wrapper fort he perl script.
Build.bat and vcregress.bat got similar treatment years ago. I'm not sure
why install.bat wasn't treated at the same time, but it seems like a good
idea anyway.
The immediate problem with the old install.bat was that it had quoting
issues, and wouldn't work if the target directory's name contained spaces.
This fixes that problem.
We're interested in the buffer size of the socket that's connected to the
client, not the one that's listening for new connections. It happened to
work, as default buffer size is the same on both, but it was clearly not
wrong.
Don't set SO_SNDBUF on recent Windows versions that have a bigger default.
It's unnecessary to set it if the default is higher in the first place.
Furthermore, setting SO_SNDBUF disables the so-called "dynamic send
buffering" feature, which hurts performance further. This can be seen
especially when the network between the client and the server has high
latency.
Remove incorrect warning from pg_archivecleanup document.
The .backup file name can be passed to pg_archivecleanup even if
it includes the extension which is specified in -x option.
However, previously the document incorrectly warned a user
not to do that.
Back-patch to 9.2 where pg_archivecleanup's -x option and
the warning were added.
Tom Lane [Sun, 5 Jul 2015 16:57:17 +0000 (12:57 -0400)]
Further reduce overhead for passing plpgsql variables to the executor.
This builds on commit 21dcda2713656a7483e3280ac9d2ada20a87a9a9 by keeping
a plpgsql function's shared ParamListInfo's entries for simple variables
(PLPGSQL_DTYPE_VARs) valid at all times. That adds a few cycles to each
assignment to such variables, but saves significantly more cycles each time
they are used; so except in the pathological case of many dead stores, this
should always be a win. Initial testing says it's good for about a 10%
speedup of simple calculations; more in large functions with many datums.
We can't use this method for row/record references unfortunately, so what
we do for those is reset those ParamListInfo slots after use; which we
can skip doing unless some of them were actually evaluated during the
previous evaluation call. So this should frequently be a win as well,
while worst case is that it's similar cost to the previous approach.
Also, closer study suggests that the previous method of instantiating a
new ParamListInfo array per evaluation is actually probably optimal for
cursor-opening executor calls. The reason is that whatever is visible in
the array is going to get copied into the cursor portal via copyParamList.
So if we used the function's main ParamListInfo for those calls, we'd end
up with all of its DTYPE_VAR vars getting copied, which might well include
large pass-by-reference values that the cursor actually has no need for.
To avoid a possible net degradation in cursor cases, go back to creating
and filling a private ParamListInfo in those cases (which therefore will be
exactly the same speed as before 21dcda271365). We still get some benefit
out of this though, because this approach means that we only have to defend
against copyParamList's try-to-fetch-every-slot behavior in the case of an
unshared ParamListInfo; so plpgsql_param_fetch() can skip testing
expr->paramnos in the common case.
To ensure that the main ParamListInfo's image of a DTYPE_VAR datum is
always valid, all assignments to such variables are now funneled through
assign_simple_var(). But this makes for cleaner and shorter code anyway.
You can no longer use pgbench with multiple threads when compiled without
--enable-thread-safety. That's an acceptable limitation these days; it
still works fine with -j1, and all modern platforms support threads anyway.
This makes future maintenance and development of the code easier.
Fix pgbench progress report behaviour when pgbench or a query gets stuck.
There were two issues here. First, if a query got stuck so that it took
e.g. 5 seconds, and progress interval was 1 second, no progress reports were
printed until the query returned. Fix so that we wake up specifically to
print the progress report. Secondly, if pgbench got stuck so that it would
nevertheless not print a progress report on time, and enough time passes
that it's already time to print the next progress report, just skip the one
that was missed. Before this patch, it would print the missed one with 0 TPS
immediately after the previous one.
Fabien Coelho. Backpatch to 9.4, where progress reports were added.
Make WAL-related utilities handle .partial WAL files properly.
Commit de76884 changed an archive recovery so that the last WAL
segment with old timeline was renamed with suffix .partial. It should
have updated WAL-related utilities so that they can handle such
.paritial WAL files, but we forgot that.
This patch changes pg_archivecleanup so that it can clean up even
archived WAL files with .partial suffix. Also it allows us to specify
.partial WAL file name as the command-line argument "oldestkeptwalfile".
This patch also changes pg_resetxlog so that it can remove .partial
WAL files in pg_xlog directory.
pg_xlogdump cannot handle .partial WAL files. Per discussion,
we decided only to document that limitation instead of adding the fix.
Because a user can easily work around the limitation (i.e., just remove
.partial suffix from the file name) and the fix seems complicated for
very narrow use case.
Back-patch to 9.5 where the problem existed.
Review by Michael Paquier.
Discussion: http://www.postgresql.org/message-id/CAHGQGwGxMKnVHGgTfiig2Bt_2djec0in3-DLJmtg7+nEiidFdQ@mail.gmail.com
Tom Lane [Thu, 2 Jul 2015 22:13:34 +0000 (18:13 -0400)]
Improve pg_restore's -t switch to match all types of relations.
-t will now match views, foreign tables, materialized views, and sequences,
not only plain tables. This is more useful, and also more consistent with
the behavior of pg_dump's -t switch, which has always matched all relation
types.
We're still not there on matching pg_dump's behavior entirely, so mention
that in the docs.
Tom Lane [Thu, 2 Jul 2015 21:24:36 +0000 (17:24 -0400)]
Make numeric form of PG version number readily available in Makefiles.
Expose PG_VERSION_NUM (e.g., "90600") as a Make variable; but for
consistency with the other Make variables holding similar info,
call the variable just VERSION_NUM not PG_VERSION_NUM.
There was some discussion of making this value available as a pg_config
value as well. However, that would entail substantially more work than
this two-line patch. Given that there was not exactly universal consensus
that we need this at all, let's just do a minimal amount of work for now.
Tom Lane [Thu, 2 Jul 2015 21:02:08 +0000 (17:02 -0400)]
Fix misuse of TextDatumGetCString().
"TextDatumGetCString(PG_GETARG_TEXT_P(x))" is formally wrong: a text*
is not a Datum. Although this coding will accidentally fail to fail on
all known platforms, it risks leaking memory if a detoast step is needed,
unlike "TextDatumGetCString(PG_GETARG_DATUM(x))" which is what's used
elsewhere. Make pg_get_object_address() fall in line with other uses.
Noted while reviewing two-arg current_setting() patch.
These variants used the old-style 'n'/' ' NULL indicators. The new-style
functions have been available since version 8.1. That should be long enough
that if there is still any old external code using these functions, they
can just switch to the new functions without worrying about backwards
compatibility
Plug some trivial memory leaks in pg_dump and pg_upgrade.
There's no point in trying to free every small allocation in these
programs that are used in a one-shot fashion, but these ones seems like
an improvement on readability grounds.
Use appendStringInfoString/Char et al where appropriate.
Patch by David Rowley. Backpatch to 9.5, as some of the calls were new in
9.5, and keeping the code in sync with master makes future backpatching
easier.
Make use of xlog_internal.h's macros in WAL-related utilities.
Commit 179cdd09 added macros to check if a filename is a WAL segment
or other such file. However there were still some instances of the
strlen + strspn combination to check for that in WAL-related utilities
like pg_archivecleanup. Those checks can be replaced with the macros.
This patch makes use of the macros in those utilities and
which would make the code a bit easier to read.
Tom Lane [Wed, 1 Jul 2015 22:55:39 +0000 (18:55 -0400)]
Don't leave pg_hba and pg_ident data lying around in running backends.
Free the contexts holding this data after we're done using it, by the
expedient of attaching them to the PostmasterContext which we were
already taking care to delete (and where, indeed, this data used to live
before commits e5e2fc842c418432 and 7c45e3a3c682f855). This saves a
probably-usually-negligible amount of space per running backend. It also
avoids leaving potentially-security-sensitive data lying around in memory
in processes that don't need it. You'd have to be unusually paranoid to
think that that amounts to a live security bug, so I've not gone so far as
to forcibly zero the memory; but there surely isn't a good reason to keep
this data around.
Arguably this is a memory management bug in the aforementioned commits,
but it doesn't seem important enough to back-patch.
Tom Lane [Wed, 1 Jul 2015 22:07:48 +0000 (18:07 -0400)]
Make sampler_random_fract() actually obey its API contract.
This function is documented to return a value in the range (0,1),
which is what its predecessor anl_random_fract() did. However, the
new version depends on pg_erand48() which returns a value in [0,1).
The possibility of returning zero creates hazards of division by zero
or trying to compute log(0) at some call sites, and it might well
break third-party modules using anl_random_fract() too. So let's
change it to never return zero. Spotted by Coverity.
XLogFileCopy() was changed heavily in commit de76884. However it was
partially reverted in commit 7abc685 and most of those changes to
XLogFileCopy() were no longer needed. Then commit 7cbee7c removed
those unnecessary code, but XLogFileCopy() looked different in master
and 9.4 though the contents are almost the same.
This patch makes XLogFileCopy() look the same in master and back-branches,
which makes back-patching easier, per discussion on pgsql-hackers.
Back-patch to 9.5.
Andres Freund [Tue, 30 Jun 2015 19:00:12 +0000 (21:00 +0200)]
Improve 9.5 release notes.
1) Add sgml comments referencing commits. This is useful to search for
missing items etc.
The comments containing the commit notes are an excerpt from:
git log --date=short \
--pretty='format:%cd [%h] %<(8,trunc)%cN: %<(48,trunc)%s%n%n%w(,4,4)%b%n' \
$(git merge-base origin/master upstream/REL9_4_STABLE)..origin/master
2) Improve a handful of existing notes
3) Add missing entries about a couple features.
4) Add a bunch of straight-forward FIXMEs
Don't call PageGetSpecialPointer() on page until it's been initialized.
After calling XLogInitBufferForRedo(), the page might be all-zeros if it was
not in page cache already. btree_xlog_unlink_page initialized the page
correctly, but it called PageGetSpecialPointer before initializing it, which
would lead to a corrupt page at WAL replay, if the unlinked page is not in
page cache.
Backpatch to 9.4, the bug came with the rewrite of B-tree page deletion.
Initialize GIN metapage correctly when replaying metapage-update WAL record.
I broke this with my WAL format refactoring patch. Before that, the metapage
was read from disk, and modified in-place regardless of the LSN. That was
always a bit silly, as there's no need to read the old page version from
disk disk when we're overwriting it anyway. So that was changed in 9.5, but
I failed to add a GinInitPage call to initialize the page-headers correctly.
Usually you wouldn't notice, because the metapage is already in the page
cache and is not zeroed.
One way to reproduce this is to perform a VACUUM on an already vacuumed
table (so that the vacuum has no real work to do), immediately after a
checkpoint, and then perform an immediate shutdown. After recovery, the
page headers of the metapage will be incorrectly all-zeroes.
Tom Lane [Mon, 29 Jun 2015 19:38:46 +0000 (15:38 -0400)]
Desultory review of 9.5 release notes.
Minor corrections and clarifications. Notably, for stuff that got moved
out of contrib, make sure it's documented somewhere other than "Additional
Modules".
I'm sure these need more work, but that's all I have time for today.
Tom Lane [Mon, 29 Jun 2015 16:42:52 +0000 (12:42 -0400)]
Code + docs review for escaping of option values (commit 11a020eb6).
Avoid memory leak from incorrect choice of how to free a StringInfo
(resetStringInfo doesn't do it). Now that pg_split_opts doesn't scribble
on the optstr, mark that as "const" for clarity. Attach the commentary in
protocol.sgml to the right place, and add documentation about the
user-visible effects of this change on postgres' -o option and libpq's
PGOPTIONS option.
Andres Freund [Mon, 29 Jun 2015 12:53:32 +0000 (14:53 +0200)]
Replace ia64 S_UNLOCK compiler barrier with a full memory barrier.
_Asm_sched_fence() is just a compiler barrier, not a memory barrier. But
spinlock release on IA64 needs, at the very least, release
semantics. Use a full barrier instead.
This might be the cause for the occasional failures on buildfarm member
anole.