Tom Lane [Wed, 1 Jun 2011 21:01:59 +0000 (17:01 -0400)]
Allow hash joins to be interrupted while searching hash table for match.
Per experimentation with a recent example, in which unreasonable amounts
of time could elapse before the backend would respond to a query-cancel.
This might be something to back-patch, but the patch doesn't apply cleanly
because this code was rewritten for 9.1. Given the lack of field
complaints I won't bother for now.
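The essence of the change is a cancel check inside the loop that walks a hash bucket looking for a match. A minimal sketch of the pattern (CHECK_FOR_INTERRUPTS() is the real backend macro; the surrounding names are illustrative, not the actual executor code):

    /* walk the bucket's tuple chain, allowing query cancel on each step */
    for (tuple = bucket_head; tuple != NULL; tuple = tuple->next)
    {
        CHECK_FOR_INTERRUPTS();
        if (tuple->hashvalue == hashvalue && keys_match(tuple, outerslot))
            return tuple;
    }
    return NULL;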
Tom Lane [Wed, 1 Jun 2011 17:09:07 +0000 (13:09 -0400)]
Further improvements in pg_ctl's new wait-for-postmaster-start logic.
Add a postmaster_is_alive() test to the wait loop, so that we stop waiting
if the postmaster dies without removing its pidfile. Unfortunately this
only helps after the postmaster has created its pidfile, since until then
we don't know which PID to check. But if it never does create the pidfile,
we can give up in a relatively short time, so this is a useful addition
in practice. Per suggestion from Fujii Masao, though this doesn't look
very much like his patch.
In addition, improve pg_ctl's ability to cope with pre-existing pidfiles.
Such a file might or might not represent a live postmaster that is going to
block our postmaster from starting, but the previous code pre-judged the
situation and gave up waiting immediately. Now, we will wait for up to 5
seconds to see if our postmaster overwrites such a file. This issue
interacts with Fujii's patch because we would make the wrong conclusion
if we did the postmaster_is_alive() test with a pre-existing PID.
All of this could be improved if we rewrote start_postmaster() so that it
could report the child postmaster's PID, so that we'd know a priori the
correct PID to test with postmaster_is_alive(). That looks like a bit too
much change for so late in the 9.1 development cycle, unfortunately.
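For reference, the classic way to test whether a PID is still alive is kill() with signal 0; a minimal standalone sketch (not pg_ctl's exact code, which also guards against matching its own PID):

    #include <errno.h>
    #include <signal.h>
    #include <stdbool.h>
    #include <sys/types.h>

    static bool
    postmaster_is_alive(pid_t pid)
    {
        /* Signal 0 sends nothing but still performs the existence check. */
        if (kill(pid, 0) == 0)
            return true;
        /* EPERM means the process exists but belongs to another user. */
        return (errno == EPERM);
    }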
Tom Lane [Tue, 31 May 2011 21:53:45 +0000 (17:53 -0400)]
Protect GIST logic that assumes penalty values can't be negative.
Apparently sane-looking penalty code might return small negative values,
for example because of roundoff error. This will confuse places like
gistchoose(). Prevent problems by clamping negative penalty values to
zero. (Just to be really sure, I also made it force NaNs to zero.)
Back-patch to all supported branches.
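A minimal sketch of the clamping rule (illustrative helper, not the actual GiST code):

    #include <math.h>

    static float
    clamp_penalty(float penalty)
    {
        /* Negative or NaN penalties confuse gistchoose(); force them to 0. */
        if (isnan(penalty) || penalty < 0.0f)
            penalty = 0.0f;
        return penalty;
    }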
Peter Eisentraut [Tue, 31 May 2011 20:10:05 +0000 (23:10 +0300)]
Recode non-ASCII characters in source to UTF-8
For consistency, have all non-ASCII characters from contributors'
names in the source be in UTF-8. But remove some other more
gratuitous uses of non-ASCII characters.
Tom Lane [Tue, 31 May 2011 20:10:46 +0000 (16:10 -0400)]
Replace use of credential control messages with getsockopt(LOCAL_PEERCRED).
It turns out the reason we hadn't found out about the portability issues
with our credential-control-message code is that almost no modern platforms
use that code at all; the ones that used to need it now offer getpeereid(),
which we choose first. The last holdout was NetBSD, and they added
getpeereid() as of 5.0. So far as I can tell, the only live platform on
which that code was being exercised was Debian/kFreeBSD, ie, FreeBSD kernel
with Linux userland --- since glibc doesn't provide getpeereid(), we fell
back to the control message code. However, the FreeBSD kernel provides a
LOCAL_PEERCRED socket parameter that's functionally equivalent to Linux's
SO_PEERCRED. That is both much simpler to use than control messages, and
superior because it doesn't require receiving a message from the other end
at just the right time.
Therefore, add code to use LOCAL_PEERCRED when necessary, and rip out all
the credential-control-message code in the backend. (libpq still has such
code so that it can still talk to pre-9.1 servers ... but eventually we can
get rid of it there too.) Clean up related autoconf probes, too.
This means that libpq's requirepeer parameter now works on exactly the same
platforms where the backend supports peer authentication, so adjust the
documentation accordingly.
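A minimal, FreeBSD-flavored sketch of the getsockopt(LOCAL_PEERCRED) approach (error handling trimmed; illustrative, not the actual backend auth code):

    #include <sys/types.h>
    #include <sys/socket.h>
    #include <sys/ucred.h>      /* struct xucred, LOCAL_PEERCRED, XUCRED_VERSION */

    static int
    get_peer_uid(int sock, uid_t *uid)
    {
        struct xucred   peercred;
        socklen_t       len = sizeof(peercred);

        /* On FreeBSD the option lives at socket level 0 (SOL_LOCAL). */
        if (getsockopt(sock, 0, LOCAL_PEERCRED, &peercred, &len) != 0 ||
            len != sizeof(peercred) ||
            peercred.cr_version != XUCRED_VERSION)
            return -1;
        *uid = peercred.cr_uid;
        return 0;
    }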
Tom Lane [Mon, 30 May 2011 23:16:05 +0000 (19:16 -0400)]
Fix portability bugs in use of credentials control messages for peer auth.
Even though our existing code for handling credentials control messages has
been basically unchanged since 2001, it was fundamentally wrong: it did not
ensure proper alignment of the supplied buffer, and it was calculating
buffer sizes and message sizes incorrectly. This led to failures on
platforms where alignment padding is relevant, for instance FreeBSD on
64-bit platforms, as seen in a recent Debian bug report passed on by
Martin Pitt (http://bugs.debian.org//cgi-bin/bugreport.cgi?bug=612888).
Rewrite to do the message-whacking using the macros specified in RFC 2292,
following a suggestion from Theo de Raadt in that thread. Tested by me
on Debian/kFreeBSD-amd64; since OpenBSD and NetBSD document the identical
CMSG API, it should work there too.
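A rough sketch of the RFC 2292 CMSG_* pattern, using a union to keep the control buffer properly aligned and sized (FreeBSD-style struct cmsgcred; illustrative only, not the patched code):

    #include <string.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <sys/uio.h>
    #include <sys/ucred.h>      /* struct cmsgcred, SCM_CREDS (FreeBSD) */

    static int
    recv_peer_uid(int sock, uid_t *uid)
    {
        char            byte;
        struct iovec    iov = { &byte, 1 };
        union
        {
            struct cmsghdr  hdr;    /* forces correct alignment */
            char            buf[CMSG_SPACE(sizeof(struct cmsgcred))];
        }               cmsgbuf;
        struct msghdr   msg;
        struct cmsghdr *cmsg;

        memset(&msg, 0, sizeof(msg));
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;
        msg.msg_control = cmsgbuf.buf;
        msg.msg_controllen = sizeof(cmsgbuf.buf);   /* CMSG_SPACE-sized */

        if (recvmsg(sock, &msg, 0) < 0)
            return -1;
        cmsg = CMSG_FIRSTHDR(&msg);
        if (cmsg == NULL ||
            cmsg->cmsg_len < CMSG_LEN(sizeof(struct cmsgcred)) ||
            cmsg->cmsg_level != SOL_SOCKET ||
            cmsg->cmsg_type != SCM_CREDS)
            return -1;
        *uid = ((struct cmsgcred *) CMSG_DATA(cmsg))->cmcred_uid;
        return 0;
    }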
Tom Lane [Mon, 30 May 2011 21:05:26 +0000 (17:05 -0400)]
Fix VACUUM so that it always updates pg_class.reltuples/relpages.
When we added the ability for vacuum to skip heap pages by consulting the
visibility map, we made it just not update the reltuples/relpages
statistics if it skipped any pages. But this could leave us with extremely
out-of-date stats for a table that contains any unchanging areas,
especially for TOAST tables which never get processed by ANALYZE. In
particular this could result in autovacuum making poor decisions about when
to process the table, as in a recent report from Florian Helmberger. And in
general it's a bad idea to not update the stats at all. Instead, use the
previous values of reltuples/relpages as an estimate of the tuple density
in unvisited pages. This approach results in a "moving average" estimate
of reltuples, which should converge to the correct value over multiple
VACUUM and ANALYZE cycles even when individual measurements aren't very
good.
This new method for updating reltuples is used by both VACUUM and ANALYZE,
with the result that we no longer need the grotty interconnections that
caused ANALYZE to not update the stats depending on what had happened
in the parent VACUUM command.
Also, fix the logic for skipping all-visible pages during VACUUM so that it
looks ahead rather than behind to decide what to do, as per a suggestion
from Greg Stark. This eliminates useless scanning of all-visible pages at
the start of the relation or just after a not-all-visible page. In
particular, the first few pages of the relation will not be invariably
included in the scanned pages, which seems to help in not overweighting
them in the reltuples estimate.
Back-patch to 8.4, where the visibility map was introduced.
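A hypothetical helper capturing the moving-average idea (not the actual estimation function in the backend, which handles more edge cases):

    #include <math.h>

    static double
    estimate_reltuples(double old_rel_pages, double old_rel_tuples,
                       double total_pages, double scanned_pages,
                       double scanned_tuples)
    {
        double      old_density;

        if (scanned_pages >= total_pages)
            return scanned_tuples;      /* scanned the whole table: exact */
        if (old_rel_pages == 0)
            return old_rel_tuples;      /* no history to extrapolate from */

        /* assume unscanned pages still hold tuples at the old density */
        old_density = old_rel_tuples / old_rel_pages;
        return floor(old_density * (total_pages - scanned_pages)
                     + scanned_tuples + 0.5);
    }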
Magnus Hagander [Mon, 30 May 2011 18:46:14 +0000 (20:46 +0200)]
Don't recommend upgrading to latest available Windows SDK
We only support up to version 7.0, so don't recommend
upgrading past it. The rest of the documentation around this
was already updated, but one spot was missed.
Magnus Hagander [Mon, 30 May 2011 18:09:51 +0000 (20:09 +0200)]
Don't include local line on platforms without support
Since we now include a sample line for replication on local
connections in pg_hba.conf, don't include it where local
connections aren't available (such as on win32).
Also make sure we use authmethodlocal and not authmethod on
the sample line.
The row-version chaining in Serializable Snapshot Isolation was still wrong.
On further analysis, it turns out that it is not needed to duplicate predicate
locks to the new row version at update, the lock on the version that the
transaction saw as visible is enough. However, there was a different bug in
the code that checks for dangerous structures when a new rw-conflict happens.
Fix that bug, and remove all the row-version chaining related code.
Kevin Grittner & Dan Ports, with some comment editorialization by me.
Alvaro Herrera [Mon, 30 May 2011 16:15:13 +0000 (12:15 -0400)]
Remove usage of &PL_sv_undef in hashes and arrays
According to perlguts, &PL_sv_undef is not the right thing to use in
those cases because it doesn't behave the same way as an undef value via
Perl code. Seems the intuitive way to deal with undef values is broken
subtly enough that it's hard to notice when it's misused.
The broken uses got inadvertently introduced in commit 87bb2ade2ce646083f39d5ab3e3307490211ad04 by Alexey Klyukin, Alex
Hunsaker and myself on 2011-02-17; no backpatch is necessary.
Tom Lane [Sat, 28 May 2011 16:36:04 +0000 (12:36 -0400)]
Fix null-dereference crash in parse_xml_decl().
parse_xml_decl's header comment says you can pass NULL for any unwanted
output parameter, but it failed to honor this contract for the "standalone"
flag. The only currently-affected caller is xml_recv, so the net effect is
that sending a binary XML value containing a standalone parameter in its
xml declaration would crash the backend. Per bug #6044 from Christopher
Dillard.
In passing, remove useless initializations of parse_xml_decl's output
parameters in xml_parse.
Back-patch to 8.3, where this code was introduced.
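The shape of the fix is just honoring the documented contract for each output parameter; a minimal fragment (variable names illustrative):

    /* Each output parameter may be NULL when the caller doesn't want it. */
    if (version)
        *version = version_value;
    if (encoding)
        *encoding = encoding_value;
    if (standalone)
        *standalone = standalone_value;     /* this store was unguarded */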
Tom Lane [Fri, 27 May 2011 18:13:38 +0000 (14:13 -0400)]
Improve corner cases in pg_ctl's new wait-for-postmaster-startup code.
With "-w -t 0", we should report "still starting up", not "ok". If we
fall out of the loop without ever being able to call PQping (because we
were never able to construct a connection string), report "no response",
not "ok". This gets rid of corner cases in which we'd claim the server
had started even though it had not.
Also, if the postmaster.pid file is not there at any point after we've
waited 5 seconds, assume the postmaster has failed and report that, rather
than almost-certainly-fruitlessly continuing to wait. The pidfile should
appear almost instantly even when there is extensive startup work to do,
so 5 seconds is already a very conservative figure. This part is per a
gripe from MauMau --- there might be better ways to do it, but nothing
simple enough to get done for 9.1.
Tom Lane [Fri, 27 May 2011 16:10:32 +0000 (12:10 -0400)]
Preserve caller's memory context in ProcessCompletedNotifies().
This is necessary to avoid long-term memory leakage, because the main loop
in PostgresMain expects to be executing in MessageContext, and hence is a
bit sloppy about freeing stuff that is only needed for the duration of
processing the current client message. The known case of an actual leak
is when encoding conversion has to be done on the incoming command string,
but there might be others. Per report from Per-Olov Esgard.
Back-patch to 9.0, where the bug was introduced by the LISTEN/NOTIFY
rewrite.
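The fix follows the usual backend convention of saving and restoring the caller's context; a minimal sketch of the pattern:

    MemoryContext oldcontext = CurrentMemoryContext;

    /* ... work that may call MemoryContextSwitchTo() internally ... */

    MemoryContextSwitchTo(oldcontext);      /* leave the caller where it started */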
Check the return code of pthread_create(). Otherwise we go into an infinite
loop if it fails, which is what happened on my HP-UX box. (I think
the reason it failed on that box is a misconfiguration on my part, but
that's no reason to hang.)
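For reference, a minimal standalone example of checking pthread_create()'s return value (a generic sketch, not the actual patched code):

    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    static void *
    worker(void *arg)
    {
        return NULL;
    }

    int
    main(void)
    {
        pthread_t   th;
        int         err = pthread_create(&th, NULL, worker, NULL);

        if (err != 0)
        {
            /* pthread_create returns an errno value; it does not set errno */
            fprintf(stderr, "pthread_create failed: %s\n", strerror(err));
            exit(1);
        }
        pthread_join(th, NULL);
        return 0;
    }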
Tom Lane [Thu, 26 May 2011 23:25:19 +0000 (19:25 -0400)]
Make decompilation of optimized CASE constructs more robust.
We had some hacks in ruleutils.c to cope with various odd transformations
that the optimizer could do on a CASE foo WHEN "CaseTestExpr = RHS" clause.
However, the fundamental impossibility of covering all cases was exposed
by Heikki, who pointed out that the "=" operator could get replaced by an
inlined SQL function, which could contain nearly anything at all. So give
up on the hacks and just print the expression as-is if we fail to recognize
it as "CaseTestExpr = RHS". (We must cover that case so that decompiled
rules print correctly; but we are not under any obligation to make EXPLAIN
output be 100% valid SQL in all cases, and already could not do so in some
other cases.) This approach requires that we have some printable
representation of the CaseTestExpr node type; I used "CASE_TEST_EXPR".
Back-patch to all supported branches, since the problem case fails in all.
Tom Lane [Thu, 26 May 2011 21:29:33 +0000 (17:29 -0400)]
Adjust configure to use "+Olibmerrno" with HP-UX C compiler, if possible.
This is reported to be necessary on some versions of that OS. In service
of this, cause PGAC_PROG_CC_CFLAGS_OPT to reject switches that result in
compiler warnings, since on yet other versions of that OS, the switch does
nothing except provoke a warning.
Report and patch by Ibrar Ahmed, further tweaking by me.
Tom Lane [Wed, 25 May 2011 20:26:45 +0000 (16:26 -0400)]
Suppress extensions in partial dumps.
We initially had pg_dump emit CREATE EXTENSION commands unconditionally.
However, pg_dump has long been in the habit of not dumping procedural
language definitions when a --schema or --table switch is given. It seems
appropriate to handle extensions the same way, since like PLs they are SQL
objects that are not in any particular schema. Per complaint from Adrian
Schreyer.
Peter Eisentraut [Wed, 25 May 2011 18:53:26 +0000 (21:53 +0300)]
Put options in some sensible order
For the --help output and reference pages of pg_dump, pg_dumpall,
pg_restore, put the options in some consistent, mostly alphabetical,
order, rather than newest option last or something like
that.
Andrew Dunstan [Wed, 25 May 2011 04:21:07 +0000 (00:21 -0400)]
Convert builddoc.bat into a perl script that actually works.
The old .bat file wasn't working for reasons that are unclear, and
which it did not seem worth the trouble to ascertain.
The new perl script has been tested and is known to work.
Soon it will be tested regularly on the buildfarm.
The .bat file is kept as a simple wrapper for the perl script.
Tom Lane [Tue, 24 May 2011 21:56:52 +0000 (17:56 -0400)]
Cleanup for pull-up-isReset patch.
Clear isReset before, not after, calling the context-specific alloc method,
so as to preserve the option to do a tail call in MemoryContextAlloc
(and also so this code isn't assuming that a failed alloc call won't have
changed the context's state before failing). Fix missed direct invocation
of reset method. Reformat a comment.
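A stripped-down sketch of the ordering being described (assertions and error handling omitted; not the full mcxt.c function):

    void *
    MemoryContextAlloc(MemoryContext context, Size size)
    {
        /* Clear isReset before the method call, not after: the call can then
         * be a tail call, and a failed allocation leaves no stale "reset"
         * claim behind. */
        context->isReset = false;
        return (*context->methods->alloc) (context, size);
    }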
Tom Lane [Mon, 23 May 2011 20:34:27 +0000 (16:34 -0400)]
Make plpgsql complain about conflicting IN and OUT parameter names.
The core CREATE FUNCTION code only enforces that IN parameter names are
non-duplicate, and that OUT parameter names are separately non-duplicate.
This is because some function languages might not have any confusion
between the two. But in plpgsql, such names are all in the same namespace,
so we'd better disallow it.
Per a recent complaint from Dan S. Not back-patching since this is a small
issue and the change could cause unexpected failures if we started to
enforce it in a minor release.
Tom Lane [Mon, 23 May 2011 16:52:46 +0000 (12:52 -0400)]
Install defenses against overflow in BuildTupleHashTable().
The planner can sometimes compute very large values for numGroups, and in
cases where we have no alternative to building a hashtable, such a value
will get fed directly to BuildTupleHashTable as its nbuckets parameter.
There were two ways in which that could go bad. First, BuildTupleHashTable
declared the parameter as "int" but most callers were passing "long"s,
so on 64-bit machines undetected overflow could occur leading to a bogus
negative value. The obvious fix for that is to change the parameter to
"long", which is what I've done in HEAD. In the back branches that seems a
bit risky, though, since third-party code might be calling this function.
So for them, just put in a kluge to treat negative inputs as INT_MAX.
Second, hash_create can go nuts with extremely large requested table sizes
(notably, my_log2 becomes an infinite loop for inputs larger than
LONG_MAX/2). What seems most appropriate to avoid that is to bound the
initial table size request to work_mem.
This fixes bug #6035 reported by Daniel Schreiber. Although the reported
case only occurs back to 8.4 since it involves WITH RECURSIVE, I think
it's a good idea to install the defenses in all supported branches.
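An illustrative fragment of the two defenses (entrysize stands in for a per-entry size estimate and is hypothetical here; work_mem is the usual GUC, in kilobytes):

    long        nbuckets_safe = nbuckets;

    /* Kluge for back branches: a negative value means the caller's "long"
     * overflowed the old "int" parameter, so treat it as "very large". */
    if (nbuckets_safe <= 0)
        nbuckets_safe = INT_MAX;

    /* Bound the initial table size request by work_mem. */
    if (nbuckets_safe > (work_mem * 1024L) / entrysize)
        nbuckets_safe = (work_mem * 1024L) / entrysize;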
Tom Lane [Sun, 22 May 2011 19:13:35 +0000 (15:13 -0400)]
Make plpgsql provide the typmods for its variables to the main parser.
Historically we didn't do this, even though we had the information, because
plpgsql passed its Params via SPI APIs that only include type OIDs not
typmods. Now that plpgsql uses parser callbacks to create Params, it's
easy to insert the right typmod. This should generally result in lower
surprise factors, because a plpgsql variable that is declared with a typmod
will now work more like a table column with the same typmod. In particular
it's the "right" way to fix bug #6020, in which plpgsql's attempt to return
an anonymous record type is defeated by stricter record-type matching
checks that were added in 9.0. However, it's not impossible that this
could result in subtle behavioral changes that could break somebody's
existing plpgsql code, so I'm afraid to back-patch this change into
released branches. In those branches we'll have to lobotomize the
record-type checks instead.
Pull up isReset flag from AllocSetContext to MemoryContext struct. This
avoids the overhead of one function call when calling MemoryContextReset(),
and it seems like the isReset optimization would be applicable to any new
memory context we might invent in the future anyway.
This buys back the overhead I just added in previous patch to always call
MemoryContextReset() in ExecScan, even when there's no quals or projections.
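With the flag in the shared struct, the no-op case can be decided without calling into the context-specific method at all; a rough sketch (not the exact mcxt.c code):

    void
    MemoryContextReset(MemoryContext context)
    {
        MemoryContextResetChildren(context);

        /* Skip the per-context reset method entirely if nothing has been
         * allocated since the last reset. */
        if (!context->isReset)
        {
            (*context->methods->reset) (context);
            context->isReset = true;
        }
    }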
Reset per-tuple memory context between every row in a scan node, even when
there's no quals or projections. Currently this only matters for foreign
scans, as none of the other scan nodes litter the per-tuple memory context
when there's no quals or projections.
Peter Eisentraut [Thu, 19 May 2011 19:56:53 +0000 (22:56 +0300)]
Fix untranslatable assembly of libpq connection failure message
Even though this only affects the insertion of a parenthesized word,
it's unwise to assume that parentheses can pass through untranslated.
And in any case, the new version is clearer in the code and for
translators.
Peter Eisentraut [Thu, 19 May 2011 18:36:57 +0000 (21:36 +0300)]
Consistent spacing for lengthy error messages
Also, we removed the display of the current value of
max_connections/MaxBackends from some messages earlier, because it was
confusing, so do that in the remaining one as well.
Alvaro Herrera [Thu, 19 May 2011 03:56:18 +0000 (23:56 -0400)]
Fix declaration of $_TD in "strict" trigger functions
This was broken in commit ef19dc6d39dd2490ff61489da55d95d6941140bf by
the Bunce/Hunsaker/Dunstan team, which moved the declaration from
plperl_create_sub to plperl_call_perl_trigger_func. This doesn't
actually work because the validator code would not find the variable
declared; and even if you manage to get past the validator, it still
doesn't work because get_sv("_TD", GV_ADD) doesn't have the expected
effect. The only reason this got beyond testing is that it only fails
in strict mode.
We need to declare it as a global just like %_SHARED; it is simpler than
trying to actually do what the patch initially intended, and is said to
have the same performance benefit.
As a more serious issue, fix $_TD not being properly local()ized,
meaning nested trigger functions would clobber $_TD.
Tom Lane [Mon, 16 May 2011 20:41:52 +0000 (16:41 -0400)]
Fix pg_dump's handling of extension-member casts and languages.
pg_dump has some heuristic rules for whether to dump casts and procedural
languages, since it's not all that easy to distinguish built-in ones from
user-defined ones. However, we should not apply those rules to objects
that belong to an extension, but just use the perfectly well-defined rules
for what to do with extension member objects. Otherwise we might
mistakenly lose extension member objects during a binary upgrade (which is
the only time that we'd want to dump extension members).
Robert Haas [Fri, 13 May 2011 19:47:31 +0000 (15:47 -0400)]
More cleanup of FOREIGN TABLE permissions handling.
This commit fixes psql, pg_dump, and the information schema to be
consistent with the backend changes which I made as part of commit be90032e0d1cf473bdd99aee94218218f59f29f1, and also includes a
related documentation tweak.
Tom Lane [Thu, 12 May 2011 15:56:38 +0000 (11:56 -0400)]
Fix write-past-buffer-end in ldapServiceLookup().
The code to assemble ldap_get_values_len's output into a single string
wrote the terminating null one byte past where it should. Fix that,
and make some other cosmetic adjustments to make the code a trifle more
readable and more in line with usual Postgres coding style.
Also, free the "result" string when done with it, to avoid a permanent
memory leak.
Bug report and patch by Albe Laurenz, cosmetic adjustments by me.
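The general shape of the corrected assembly, as a hypothetical fragment assuming n >= 1 (values, n, and buf are illustrative names, not the libpq variables):

    size_t  len = 0;
    char   *buf, *p;
    int     i;

    for (i = 0; i < n; i++)
        len += strlen(values[i]) + 1;       /* each value plus separator/null */
    buf = malloc(len);
    if (buf == NULL)
        return 1;
    p = buf;
    for (i = 0; i < n; i++)
    {
        size_t  vlen = strlen(values[i]);

        memcpy(p, values[i], vlen);
        p += vlen;
        *p++ = (i < n - 1) ? ' ' : '\0';    /* terminator stays inside buf */
    }
    /* ... use buf, then free(buf) to avoid the leak mentioned above ... */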
Tom Lane [Wed, 11 May 2011 23:57:38 +0000 (19:57 -0400)]
Split PGC_S_DEFAULT into two values, for true boot_val vs computed default.
Failure to distinguish these cases is the real cause behind the recent
reports of Windows builds crashing on 'infinity'::timestamp, which was
directly due to failure to establish a value of timezone_abbreviations
in postmaster child processes. The postmaster had the desired value,
but write_one_nondefault_variable() didn't transmit it to backends.
To fix that, invent a new value PGC_S_DYNAMIC_DEFAULT, and be sure to use
that or PGC_S_ENV_VAR (as appropriate) for "default" settings that are
computed during initialization. (We need both because there's at least
one variable that could receive a value from either source.)
This commit also fixes ProcessConfigFile's failure to restore the correct
default value for certain GUC variables if they are set in postgresql.conf
and then removed/commented out of the file. We have to recompute and
reinstall the value for any GUC variable that could have received a value
from PGC_S_DYNAMIC_DEFAULT or PGC_S_ENV_VAR sources, and there were a
number of oversights. (That whole thing is a crock that needs to be
redesigned, but not today.)
However, I intentionally didn't make it work "exactly right" for the cases
of timezone and log_timezone. The exactly right behavior would involve
running select_default_timezone, which we'd have to do independently in
each postgres process, causing the whole database to become entirely
unresponsive for as much as several seconds. That didn't seem like a good
idea, especially since the variable's removal from postgresql.conf might be
just an accidental edit. Instead the behavior is to adopt the previously
active setting as if it were default.
Note that this patch creates an ABI break for extensions that use any of
the PGC_S_XXX constants; they'll need to be recompiled.
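An abridged sketch of the source enum after this change (only the entries relevant here; the real GucSource has more values):

    typedef enum
    {
        PGC_S_DEFAULT,              /* hard-wired default ("boot_val") */
        PGC_S_DYNAMIC_DEFAULT,      /* default computed during initialization */
        PGC_S_ENV_VAR,              /* postmaster environment variable */
        /* ... file, command-line, and interactive sources follow ... */
    } GucSource;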
Tom Lane [Wed, 11 May 2011 18:43:01 +0000 (14:43 -0400)]
Clean up parsing of CREATE TRIGGER's argument list.
Use ColLabel in place of ColId, so that reserved words are accepted as if
they were not reserved. Also, remove BCONST and XCONST, which were never
documented as allowed. Allowing those exposes to users an implementation
detail, namely the format in which the lexer outputs such constants, that
seems unwise to expose.
No documentation change needed, since this just makes the code act more
like you'd expect from reading the CREATE TRIGGER man page.
Per complaint from Szymon Guz and subsequent discussion.
Shut down WAL receiver if it's still running at end of recovery. We used to
just check that it's not running and PANIC if it was, but that can rightfully
happen if recovery stops at recovery target.
Tom Lane [Wed, 11 May 2011 00:36:22 +0000 (20:36 -0400)]
Prevent datebsearch() from crashing on base == NULL && nel == 0.
Normally nel == 0 works okay because the initial value of "last" will be
less than "base"; but if "base" is zero then the calculation wraps around
and we have a very large (unsigned) value for "last", so that the loop can
be entered and we get a SIGSEGV on a bogus pointer.
This is certainly the proximate cause of the recent reports of Windows
builds crashing on 'infinity'::timestamp --- evidently, they're either not
setting an active timezonetktbl, or setting an empty one. It's not yet
clear to me why it's only happening on Windows and not happening on any
buildfarm member. But even if that's due to some bug elsewhere, it seems
wise for this function to not choke on the powerup values of
timezonetktbl/sztimezonetktbl.
I also changed the copy of this code in ecpglib, although I am not sure
whether it's exposed to a similar hazard.
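The shape of the fix is a guard on nel before computing the "last" pointer; a minimal fragment (datetkn as in the backend's datetime code, loop body elided):

    if (nel > 0)
    {
        const datetkn *last = base + nel - 1;   /* safe only when nel > 0 */

        /* ... existing binary-search loop over base[0 .. nel-1] ... */
    }
    return NULL;                                /* not found, or empty table */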