Peter Eisentraut [Mon, 30 Apr 2012 18:12:28 +0000 (21:12 +0300)]
Fix display of <command> elements on man pages
We had changed this from the default bold to monospace for all output
formats, but for man pages, this creates visual inconsistencies, so
revert to the default for man pages.
Tom Lane [Mon, 30 Apr 2012 18:02:47 +0000 (14:02 -0400)]
Converge all SQL-level statistics timing values to float8 milliseconds.
This patch adjusts the core statistics views to match the decision already
taken for pg_stat_statements, that values representing elapsed time should
be represented as float8 and measured in milliseconds. By using float8,
we are no longer tied to a specific maximum precision of timing data.
(Internally, it's still microseconds, but we could now change that without
needing changes at the SQL level.)
The columns affected are
pg_stat_bgwriter.checkpoint_write_time
pg_stat_bgwriter.checkpoint_sync_time
pg_stat_database.blk_read_time
pg_stat_database.blk_write_time
pg_stat_user_functions.total_time
pg_stat_user_functions.self_time
pg_stat_xact_user_functions.total_time
pg_stat_xact_user_functions.self_time
The first four of these are new in 9.2, so there is no compatibility issue
from changing them. The others require a release note comment that they
are now double precision (and can show a fractional part) rather than
bigint as before; also their underlying statistics functions now match
the column definitions, instead of returning bigint microseconds.
Tom Lane [Sun, 29 Apr 2012 19:35:57 +0000 (15:35 -0400)]
Further editorialization on the new documentation for statistics views.
Get rid of the per-column documentation of underlying functions, which did
far more to clutter the view descriptions than it did to be helpful, and
was rather incomplete and typo-ridden anyway. Instead suggest that people
consult the definitions of the standard views to see the underlying
functions.
The older functions for obtaining individual facts about backends are now
somewhat obsoleted by pg_stat_get_activity, which means that they are not
documented by any standard view. So I put that information into a separate
table. (Maybe we should just deprecate them instead?)
In passing, fix a couple more documentation errors.
Peter Eisentraut [Sun, 29 Apr 2012 18:07:35 +0000 (21:07 +0300)]
Change return type of ExceptionalCondition to void and mark it noreturn
In ancient times, it was thought that this wouldn't work because of
TrapMacro/AssertMacro, but changing those to use a comma operator
appears to work without compiler warnings.
Tom Lane [Sat, 28 Apr 2012 20:03:57 +0000 (16:03 -0400)]
Adjust timing units in pg_stat_statements.
Display total time and I/O timings in milliseconds, for consistency with
the units used for timings in the core statistics views. The columns
remain of float8 type, so that sub-msec precision is available. (At some
point we will probably want to convert the core views to use float8 type
for the same reason, but this patch does not touch that issue.)
This is a release-note-requiring change in the meaning of the total_time
column. The I/O timing columns are new as of 9.2, so there is no
compatibility impact from redefining them.
Do some minor copy-editing in the documentation, too.
Tom Lane [Fri, 27 Apr 2012 23:49:18 +0000 (19:49 -0400)]
Fix printing of whole-row Vars at top level of a SELECT targetlist.
Normally whole-row Vars are printed as "tabname.*". However, that does not
work at top level of a targetlist, because per SQL standard the parser will
think that the "*" should result in column-by-column expansion; which is
not at all what a whole-row Var implies. We used to just print the table
name in such cases, which works most of the time; but it fails if the table
name matches a column name available anywhere in the FROM clause. This
could lead for instance to a view being interpreted differently after dump
and reload. Adding parentheses doesn't fix it, but there is a reasonably
simple kluge we can use instead: attach a no-op cast, so that the "*" isn't
syntactically at top level anymore. This makes the printing of such
whole-row Vars a lot more consistent with other Vars, and may indeed fix
more cases than just the reported one; I'm suspicious that cases involving
schema qualification probably didn't work properly before, either.
Per bug report and fix proposal from Abbas Butt, though this patch is quite
different in detail from his.
Tom Lane [Fri, 27 Apr 2012 04:12:42 +0000 (00:12 -0400)]
Fix syslogger's rotation disable/re-enable logic.
If it fails to open a new log file, the syslogger assumes there's something
wrong with its parameters (such as log_directory), and stops attempting
automatic time-based or size-based log file rotations. Sending it SIGHUP
is supposed to start that up again. However, the original coding for that
was really bogus, involving clobbering a couple of GUC variables and hoping
that SIGHUP processing would restore them. Get rid of that technique in
favor of maintaining a separate flag showing we've turned rotation off.
Per report from Mark Kirkwood.
Also, the syslogger will automatically attempt to create the log_directory
directory if it doesn't exist, but that was only happening at startup.
For consistency and ease of use, it should do the same whenever the value
of log_directory is changed by SIGHUP.
Robert Haas [Fri, 27 Apr 2012 00:00:21 +0000 (20:00 -0400)]
Prevent index-only scans from returning wrong answers under Hot Standby.
The alternative of disallowing index-only scans in HS operation was
discussed, but the consensus was that it was better to treat marking
a page all-visible as a recovery conflict for snapshots that could still
fail to see XIDs on that page. We may in the future try to soften this,
so that we simply force index scans to do heap fetches in cases where
this may be an issue, rather than throwing a hard conflict.
Tom Lane [Thu, 26 Apr 2012 22:28:52 +0000 (18:28 -0400)]
Improve documentation around historical calendar rules.
Get rid of section 8.5.6 (Date/Time Internals), which appears to confuse
people more than it helps, and anyway discussion of Postgres' internal
datetime calculation methods seems pretty out of place here. Instead,
make datatype.sgml just say that we follow the Gregorian calendar (a bit
of specification not previously present anywhere in that chapter :-()
and link to the History of Units appendix for more info. Do some mild
editorialization on that appendix, too, to make it clearer that we are
following proleptic Gregorian calendar rules rather than anything more
historically accurate.
Per a question from Florence Cousin and subsequent discussion in
pgsql-docs.
Tom Lane [Thu, 26 Apr 2012 18:17:13 +0000 (14:17 -0400)]
Fix oversight in recent parameterized-path patch.
bitmap_scan_cost_est() has to be able to cope with a BitmapOrPath, but
I'd taken a shortcut that didn't work for that case. Noted by Heikki.
Add some regression tests since this area is evidently under-covered.
Peter Eisentraut [Thu, 26 Apr 2012 18:03:48 +0000 (21:03 +0300)]
PL/Python: Accept strings in functions returning composite types
Before 9.1, PL/Python functions returning composite types could return
a string and it would be parsed using record_in. The 9.1 changes made
PL/Python only expect dictionaries, tuples, or objects supporting
getattr as output of composite functions, resulting in a regression
and a confusing error message, as the strings were interpreted as
sequences and the code for transforming lists to database tuples was
used. Fix this by treating strings separately as before, before
checking for the other types.
The reason why it's important to support string to database tuple
conversion is that trigger functions on tables with composite columns
get the composite row passed in as a string (from record_out).
Without supporting converting this back using record_in, this makes it
impossible to implement pass-through behavior for these columns, as
PL/Python no longer accepts strings for composite values.
A better solution would be to fix the code that transforms composite
inputs into Python objects to produce dictionaries that would then be
correctly interpreted by the Python->PostgreSQL counterpart code. But
that would be too invasive to backpatch to 9.1, and it is too late in
the 9.2 cycle to attempt it. It should be revisited in the future,
though.
Tom Lane [Thu, 26 Apr 2012 02:57:48 +0000 (22:57 -0400)]
Modify create_index regression test to avoid intermittent failures.
We have been seeing intermittent buildfarm failures due to a query
sometimes not using an index-only scan plan, because a background
auto-ANALYZE prevented the table's all-visible bits from being set
immediately, thereby causing the estimated cost of an index-only scan
to go up considerably. Adjust the test case so that a bitmap index scan is
preferred instead, which serves equally well for the purpose the test case
is actually meant for. (Of course, it would be better to eliminate the
interference from auto-ANALYZE, but I see no low-risk way to do that,
so any such fix will have to be left for 9.3 or later.)
Tom Lane [Thu, 26 Apr 2012 00:20:33 +0000 (20:20 -0400)]
Fix planner's handling of RETURNING lists in writable CTEs.
setrefs.c failed to do "rtoffset" adjustment of Vars in RETURNING lists,
which meant they were left with the wrong varnos when the RETURNING list
was in a subquery. That was never possible before writable CTEs, of
course, but now it's broken. The executor fails to notice any problem
because ExecEvalVar just references the ecxt_scantuple for any normal
varno; but EXPLAIN breaks when the varno is wrong, as illustrated in a
recent complaint from Bartosz Dmytrak.
Since the eventual rtoffset of the subquery is not known at the time
we are preparing its plan node, the previous scheme of executing
set_returning_clause_references() at that time cannot handle this
adjustment. Fortunately, it turns out that we don't really need to do it
that way, because all the needed information is available during normal
setrefs.c execution; we just have to dig it out of the ModifyTable node.
So, do that, and get rid of the kluge of early setrefs processing of
RETURNING lists. (This is a little bit of a cheat in the case of inherited
UPDATE/DELETE, because we are not passing a "root" struct that corresponds
exactly to what the subplan was built with. But that doesn't matter, and
anyway this is less ugly than early setrefs processing was.)
Back-patch to 9.1, where the problem became possible to hit.
Tom Lane [Wed, 25 Apr 2012 21:25:12 +0000 (17:25 -0400)]
Fix edge-case behavior of pg_next_dst_boundary().
Due to rather sloppy thinking (on my part, I'm afraid) about the
appropriate behavior for boundary conditions, pg_next_dst_boundary() gave
undefined, platform-dependent results when the input time is exactly the
last recorded DST transition time for the specified time zone, as a result
of fetching values one past the end of its data arrays.
Change its specification to be that it always finds the next DST boundary
*after* the input time, and adjust code to match that. The sole existing
caller, DetermineTimeZoneOffset, doesn't actually care about this
distinction, since it always uses a probe time earlier than the instant
that it does care about. So it seemed best to me to change the API to make
the result=1 and result=0 cases more consistent, specifically to ensure
that the "before" outputs always describe the state at the given time,
rather than hacking the code to obey the previous API comment exactly.
Per bug #6605 from Sergey Burladyan. Back-patch to all supported versions.
Robert Haas [Tue, 24 Apr 2012 13:20:53 +0000 (09:20 -0400)]
Casts to or from a domain type are ignored; warn and document.
Prohibiting this outright would break dumps taken from older versions
that contain such casts, which would create far more pain than is
justified here.
Per report by Jaime Casanova and subsequent discussion.
Robert Haas [Tue, 24 Apr 2012 02:08:06 +0000 (22:08 -0400)]
Rearrange lazy_scan_heap to avoid visibility map race conditions.
We must set the visibility map bit before releasing our exclusive lock
on the heap page; otherwise, someone might clear the heap page bit
before we set the visibility map bit, leading to a situation where the
visibility map thinks the page is all-visible but it's really not.
This problem has existed since 8.4, but it wasn't critical before we
had index-only scans, since the worst case scenario was that the page
wouldn't get vacuumed until the next scan_all vacuum.
Along the way, a couple of minor, related improvements: (1) if we
pause the heap scan to do an index vac cycle, release any visibility
map page we're holding, since really long-running pins are not good
for a variety of reasons; and (2) warn if we see a page that's marked
all-visible in the visibility map but not on the page level, since
that should never happen any more (it was allowed in previous
releases, but not in 9.2).
Tom Lane [Sat, 21 Apr 2012 04:51:14 +0000 (00:51 -0400)]
Use fuzzy not exact cost comparison for the final tie-breaker in add_path.
Instead of an exact cost comparison, use a fuzzy comparison with 1e-10
delta after all other path metrics have proved equal. This is to avoid
having platform-specific roundoff behaviors determine the choice when
two paths are really the same to our cost estimators. Adjust the
recently-added test case that made it obvious we had a problem here.
Recast "ONLY" column CHECK constraints as NO INHERIT
The original syntax wasn't universally loved, and it didn't allow its
usage in CREATE TABLE, only ALTER TABLE. It now works everywhere, and
it also allows using ALTER TABLE ONLY to add an uninherited CHECK
constraint, per discussion.
The pg_constraint column has accordingly been renamed connoinherit.
This commit partly reverts some of the changes in 61d81bd28dbec65a6b144e0cd3d0bfe25913c3ac, particularly some pg_dump and
psql bits, because now pg_get_constraintdef includes the necessary NO
INHERIT within the constraint definition.
Tom Lane [Sat, 21 Apr 2012 00:10:46 +0000 (20:10 -0400)]
Adjust join_search_one_level's handling of clauseless joins.
For an initial relation that lacks any join clauses (that is, it has to be
cartesian-product-joined to the rest of the query), we considered only
cartesian joins with initial rels appearing later in the initial-relations
list. This creates an undesirable dependency on FROM-list order. We would
never fail to find a plan, but perhaps we might not find the best available
plan. Noted while discussing the logic with Amit Kapila.
Improve the comments a bit in this area, too.
Arguably this is a bug fix, but given the lack of complaints from the
field I'll refrain from back-patching.
Tom Lane [Thu, 19 Apr 2012 19:52:46 +0000 (15:52 -0400)]
Revise parameterized-path mechanism to fix assorted issues.
This patch adjusts the treatment of parameterized paths so that all paths
with the same parameterization (same set of required outer rels) for the
same relation will have the same rowcount estimate. We cache the rowcount
estimates to ensure that property, and hopefully save a few cycles too.
Doing this makes it practical for add_path_precheck to operate without
a rowcount estimate: it need only assume that paths with different
parameterizations never dominate each other, which is close enough to
true anyway for coarse filtering, because normally a more-parameterized
path should yield fewer rows thanks to having more join clauses to apply.
In add_path, we do the full nine yards of comparing rowcount estimates
along with everything else, so that we can discard parameterized paths that
don't actually have an advantage. This fixes some issues I'd found with
add_path rejecting parameterized paths on the grounds that they were more
expensive than not-parameterized ones, even though they yielded many fewer
rows and hence would be cheaper once subsequent joining was considered.
To make the same-rowcounts assumption valid, we have to require that any
parameterized path enforce *all* join clauses that could be obtained from
the particular set of outer rels, even if not all of them are useful for
indexing. This is required at both base scans and joins. It's a good
thing anyway since the net impact is that join quals are checked at the
lowest practical level in the join tree. Hence, discard the original
rather ad-hoc mechanism for choosing parameterization joinquals, and build
a better one that has a more principled rule for when clauses can be moved.
The original rule was actually buggy anyway for lack of knowledge about
which relations are part of an outer join's outer side; getting this right
requires adding an outer_relids field to RestrictInfo.
Robert Haas [Wed, 18 Apr 2012 15:17:30 +0000 (11:17 -0400)]
Tighten up error recovery for fast-path locking.
The previous code could cause a backend crash after BEGIN; SAVEPOINT a;
LOCK TABLE foo (interrupted by ^C or statement timeout); ROLLBACK TO
SAVEPOINT a; LOCK TABLE foo, and might have leaked strong-lock counts
in other situations.
Report by Zoltán Böszörményi; patch review by Jeff Davis.
Robert Haas [Wed, 18 Apr 2012 14:49:37 +0000 (10:49 -0400)]
After PageSetAllVisible, use MarkBufferDirty.
Previously, we used SetBufferCommitInfoNeedsSave, but that's really
intended for dirty-marks we can theoretically afford to lose, such as
hint bits. As for 9.2, the PD_ALL_VISIBLE mustn't be lost in this
way, since we could then end up with a heap page that isn't
all-visible and a visibility map page that is all visible, causing
index-only scans to return wrong answers.
Robert Haas [Wed, 18 Apr 2012 14:43:16 +0000 (10:43 -0400)]
Fix various infelicities in node functions.
Mostly, this consists of adding support for fields which exist in the
structure but aren't handled by copy/equal/outfuncs; but the create
foreign table case can actually produce garbage output.
Andrew Dunstan [Tue, 17 Apr 2012 22:30:34 +0000 (18:30 -0400)]
Don't override arguments set via options with positional arguments.
A number of utility programs were rather careless about paremeters
that can be set via both an option argument and a positional
argument. This leads to results which can violate the Principal
Of Least Astonishment. These changes refuse to use positional
arguments to override settings that have been made via positional
arguments. The changes are backpatched to all live branches.
Don't wait for the commit record to be replicated if we wrote no WAL.
When using synchronous replication, we waited for the commit record to be
replicated, but if we our transaction didn't write any other WAL records,
that's not required because we don't even flush the WAL locally to disk in
that case. This lead to long waits when committing a transaction that only
modified a temporary table. Bug spotted by Thom Brown.
Install plpgsql.h to to include/server at "make install".
The header file is needed by any module that wants to use the PL/pgSQL
instrumentation plugin interface. Most notably, the pldebugger plugin needs
this. With this patch, it can be built using pgxs, without having the full
server source tree available.
Add missing descriptions about '--timeout' and '--mode' to help
message. They are already implemented in the source code.
Suggestions about the message formatting from Tom Lane.
Robert Haas [Sat, 14 Apr 2012 12:04:11 +0000 (08:04 -0400)]
pg_size_pretty(numeric)
The output of the new pg_xlog_location_diff function is of type numeric,
since it could theoretically overflow an int8 due to signedness; this
provides a convenient way to format such values.
Peter Eisentraut [Sat, 14 Apr 2012 06:29:54 +0000 (09:29 +0300)]
Update contrib/README
Remove lots of outdated information that is duplicated by the
better-maintained SGML documentation. In particular, remove the
outdated listing of contrib modules. Update the installation
instructions to mention CREATE EXTENSION, but don't go into too much
detail.
Tom Lane [Fri, 13 Apr 2012 20:03:16 +0000 (16:03 -0400)]
Remove the "last ditch" code path in join_search_one_level().
So far as I can tell, it is no longer possible for this heuristic to do
anything useful, because the new weaker definition of
have_relevant_joinclause means that any relation with a joinclause must be
considered joinable to at least one other relation. It would still be
possible for the code block to be entered, for example if there are join
order restrictions that prevent any join of the current level from being
formed; but in that case it's just a waste of cycles to attempt to form
cartesian joins, since the restrictions will still apply.
Furthermore, IMO the existence of this code path can mask bugs elsewhere;
we would have noticed the problem with cartesian joins a lot sooner if
this code hadn't compensated for it in the simplest case.
Accordingly, let's remove it and see what happens. I'm committing this
separately from the prerequisite changes in have_relevant_joinclause,
just to make the question easier to revisit if there is some fault in
my logic.
Tom Lane [Fri, 13 Apr 2012 19:32:34 +0000 (15:32 -0400)]
Weaken the planner's tests for relevant joinclauses.
We should be willing to cross-join two small relations if that allows us
to use an inner indexscan on a large relation (that is, the potential
indexqual for the large table requires both smaller relations). This
worked in simple cases but fell apart as soon as there was a join clause
to a fourth relation, because the existence of any two-relation join clause
caused the planner to not consider clauseless joins between other base
relations. The added regression test shows an example case adapted from
a recent complaint from Benoit Delbosc.
Adjust have_relevant_joinclause, have_relevant_eclass_joinclause, and
has_relevant_eclass_joinclause to consider that a join clause mentioning
three or more relations is sufficient grounds for joining any subset of
those relations, even if we have to do so via a cartesian join. Since such
clauses are relatively uncommon, this shouldn't affect planning speed on
typical queries; in fact it should help a bit, because the latter two
functions in particular get significantly simpler.
Although this is arguably a bug fix, I'm not going to risk back-patching
it, since it might have currently-unforeseen consequences.
Peter Eisentraut [Fri, 13 Apr 2012 18:36:59 +0000 (21:36 +0300)]
Rename bytea_agg to string_agg and add delimiter argument
Per mailing list discussion, we would like to keep the bytea functions
parallel to the text functions, so rename bytea_agg to string_agg,
which already exists for text.
Also, to satisfy the rule that we don't want aggregate functions of
the same name with a different number of arguments, add a delimiter
argument, just like string_agg for text already has.
Tom Lane [Thu, 12 Apr 2012 00:24:17 +0000 (20:24 -0400)]
Fix cost estimation for indexscan filter conditions.
cost_index's method for estimating per-tuple costs of evaluating filter
conditions (a/k/a qpquals) was completely wrong in the presence of derived
indexable conditions, such as range conditions derived from a LIKE clause.
This was largely masked in common cases as a result of all simple operator
clauses having about the same costs, but it could show up in a big way when
dealing with functional indexes containing expensive functions, as seen for
example in bug #6579 from Istvan Endredy. Rejigger the calculation to give
sane answers when the indexquals aren't a subset of the baserestrictinfo
list. As a side benefit, we now do the calculation properly for cases
involving join clauses (ie, parameterized indexscans), which we always
overestimated before.
There are still cases where this is an oversimplification, such as clauses
that can be dropped because they are implied by a partial index's
predicate. But we've never accounted for that in cost estimates before,
and I'm not convinced it's worth the cycles to try to do so.
Tom Lane [Wed, 11 Apr 2012 15:29:22 +0000 (11:29 -0400)]
Silently ignore any nonexistent schemas that are listed in search_path.
Previously we attempted to throw an error or at least warning for missing
schemas, but this was done inconsistently because of implementation
restrictions (in many cases, GUC settings are applied outside transactions
so that we can't do system catalog lookups). Furthermore, there were
exceptions to the rule even in the beginning, and we'd been poking more
and more holes in it as time went on, because it turns out that there are
lots of use-cases for having some irrelevant items in a common search_path
value. It seems better to just adopt a philosophy similar to what's always
been done with Unix PATH settings, wherein nonexistent or unreadable
directories are silently ignored.
This commit also fixes the documentation to point out that schemas for
which the user lacks USAGE privilege are silently ignored. That's always
been true but was previously not documented.
This is mostly in response to Robert Haas' complaint that 9.1 started to
throw errors or warnings for missing schemas in cases where prior releases
had not. We won't adopt such a significant behavioral change in a back
branch, so something different will be needed in 9.1.
Accept postgres:// URIs in libpq connection functions
postgres:// URIs are an attempt to "stop the bleeding" in this general
area that has been said to occur due to external projects adopting their
own syntaxes. The syntaxes supported by this patch:
should be enough to cover most interesting cases without having to
resort to "param=value" pairs, but those are provided for the cases that
need them regardless.
libpq documentation has been shuffled around a bit, to avoid stuffing
all the format details into the PQconnectdbParams description, which was
already a bit overwhelming. The list of keywords has moved to its own
subsection, and the details on the URI format live in another subsection.
This includes a simple test program, as requested in discussion, to
ensure that interesting corner cases continue to work appropriately in
the future.
Author: Alexander Shulgin
Some tweaking by Álvaro Herrera, Greg Smith, Daniel Farina, Peter Eisentraut
Reviewed by Robert Haas, Alexey Klyukin (offlist), Heikki Linnakangas,
Marko Kreen, and others
Oh, it also supports postgresql:// but that's probably just an accident.
Tom Lane [Wed, 11 Apr 2012 01:42:46 +0000 (21:42 -0400)]
Make pg_tablespace_location(0) return the database's default tablespace.
This definition is convenient when applying the function to the
reltablespace column of pg_class, since that's what zero means there;
and it doesn't interfere with any other plausible use of the function.
Per gripe from Bruce Momjian.
Bruce Momjian [Tue, 10 Apr 2012 23:57:14 +0000 (19:57 -0400)]
Fix pg_upgrade to properly upgrade a table that is stored in the cluster
default tablespace, but part of a database that is in a user-defined
tablespace. Caused "file not found" error during upgrade.
Tom Lane [Tue, 10 Apr 2012 16:04:42 +0000 (12:04 -0400)]
Measure epoch of timestamp-without-time-zone from local not UTC midnight.
This patch reverts commit 191ef2b407f065544ceed5700e42400857d9270f
and thereby restores the pre-7.3 behavior of EXTRACT(EPOCH FROM
timestamp-without-tz). Per discussion, the more recent behavior was
misguided on a couple of grounds: it makes it hard to get a
non-timezone-aware epoch value for a timestamp, and it makes this one
case dependent on the value of the timezone GUC, which is incompatible
with having timestamp_part() labeled as immutable.
The other behavior is still available (in all releases) by explicitly
casting the timestamp to timestamp with time zone before applying EXTRACT.
This will need to be called out as an incompatible change in the 9.2
release notes. Although having mutable behavior in a function marked
immutable is clearly a bug, we're not going to back-patch such a change.
Tom Lane [Tue, 10 Apr 2012 00:49:01 +0000 (20:49 -0400)]
Adjust various references to GEQO being non-deterministic.
It's still non-deterministic in some sense ... but given fixed settings
and identical planning problems, it will now always choose the same plan,
so we probably shouldn't tar it with that brush. Per bug #6565 from
Guillaume Cottenceau. Back-patch to 9.0 where the behavior was fixed.
Tom Lane [Mon, 9 Apr 2012 15:58:24 +0000 (11:58 -0400)]
Fix an Assert that turns out to be reachable after all.
estimate_num_groups() gets unhappy with
create table empty();
select * from empty except select * from empty e2;
I can't see any actual use-case for such a query (and the table is illegal
per SQL spec), but it seems like a good idea that it not cause an assert
failure.
Tom Lane [Mon, 9 Apr 2012 15:41:54 +0000 (11:41 -0400)]
Don't bother copying empty support arrays in a zero-column MergeJoin.
The case could not arise when this code was originally written, but it can
now (since we made zero-column MergeJoins work for the benefit of FULL JOIN
ON TRUE). I don't think there is any actual bug here, but we might as well
treat it consistently with other uses of COPY_POINTER_FIELD(). Per comment
from Ashutosh Bapat.
Tom Lane [Mon, 9 Apr 2012 15:16:04 +0000 (11:16 -0400)]
Save a few cycles while creating "sticky" entries in pg_stat_statements.
There's no need to sit there and increment the stats when we know all the
increments would be zero anyway. The actual additions might not be very
expensive, but skipping acquisition of the spinlock seems like a good
thing. Pushing the logic about initialization of the usage count down into
entry_alloc() allows us to do that while making the code actually simpler,
not more complex. Expansion on a suggestion by Peter Geoghegan.
Tom Lane [Sun, 8 Apr 2012 19:49:47 +0000 (15:49 -0400)]
Improve management of "sticky" entries in contrib/pg_stat_statements.
This patch addresses a deficiency in the previous pg_stat_statements patch.
We want to give sticky entries an initial "usage" factor high enough that
they probably will stick around until their query is completed. However,
if the query never completes (eg it gets an error during execution), the
entry shouldn't persist indefinitely. Manage this by starting out with
a usage setting equal to the (approximate) median usage value within the
whole hashtable, but decaying the value much more aggressively than we
do for normal entries.
set_stack_base() no longer needs to be called in PostgresMain.
This was a thinko in previous commit. Now that stack base pointer is now set
in PostmasterMain and SubPostmasterMain, it doesn't need to be set in
PostgresMain anymore.
Do stack-depth checking in all postmaster children.
We used to only initialize the stack base pointer when starting up a regular
backend, not in other processes. In particular, autovacuum workers can run
arbitrary user code, and without stack-depth checking, infinite recursion
in e.g an index expression will bring down the whole cluster.
The comment about PL/Java using set_stack_base() is not yet true. As the
code stands, PL/java still modifies the stack_base_ptr variable directly.
However, it's been discussed in the PL/Java mailing list that it should be
changed to use the function, because PL/Java is currently oblivious to the
register stack used on Itanium. There's another issues with PL/Java, namely
that the stack base pointer it sets is not really the base of the stack, it
could be something close to the bottom of the stack. That's a separate issue
that might need some further changes to this code, but that's a different
story.