Peter Eisentraut [Wed, 16 Dec 2009 23:05:00 +0000 (23:05 +0000)]
Don't unblock SIGQUIT in the SIGQUIT handler
This was possibly linked to a deadlock-like situation in glibc syslog code
invoked by the ereport call in quickdie(). In any case, a signal handler
should not unblock its own signal unless there is a specific reason to.
Tom Lane [Wed, 16 Dec 2009 22:24:13 +0000 (22:24 +0000)]
Avoid a premature coercion failure in transformSetOperationTree() when
presented with an UNKNOWN-type Var, which can happen in cases where an
unknown literal appeared in a subquery. While many such cases will fail
later on anyway in the planner, there are some cases where the planner is
able to flatten the query and replace the Var by the constant before it has
to coerce the union column to the final type. I had added this check in 8.4
to provide earlier/better error detection, but it causes a regression for
some cases that worked OK before. Fix by not making the check if the input
node is UNKNOWN type and not a Const or Param. If it isn't going to work,
it will fail anyway at plan time, with the only real loss being inability to
provide an error cursor. Per gripe from Britt Piehler.
In passing, rename a couple of variables to remove confusion from an
inner scope masking the same variable names in an outer scope.
Robert Haas [Wed, 16 Dec 2009 22:16:16 +0000 (22:16 +0000)]
Several fixes for EXPLAIN (FORMAT YAML), plus one for EXPLAIN (FORMAT JSON).
ExplainSeparatePlans() was busted for both JSON and YAML output - the present
code is a holdover from the original version of my machine-readable explain
patch, which didn't have the grouping_stack machinery. Also, fix an odd
distribution of labor between ExplainBeginGroup() and ExplainYAMLLineStarting()
when marking lists with "- ", with each providing one character. This broke
the output format for multi-query statements. Also, fix ExplainDummyGroup()
for the YAML output format.
Along the way, make the YAML format use escape_yaml() in situations where the
JSON format uses escape_json(). Right now, it doesn't matter because all the
values are known not to need escaping, but it seems safer this way. Finally,
I added some comments to better explain what the YAML output format is doing.
Greg Sabino Mullane reported the issues with multi-query statements.
Analysis and remaining cleanups by me.
Michael Meskes [Wed, 16 Dec 2009 10:15:07 +0000 (10:15 +0000)]
Fixed auto-prepare to not try preparing statements that are not preparable. Bug
found and solved by Boszormenyi Zoltan <zb@cybertec.at>, some small adjustments
by me.
Peter Eisentraut [Tue, 15 Dec 2009 22:59:55 +0000 (22:59 +0000)]
Python 3 support in PL/Python
Behaves more or less unchanged compared to Python 2, but the new language
variant is called plpython3u. Documentation describing the naming scheme
is included.
Tom Lane [Tue, 15 Dec 2009 20:37:17 +0000 (20:37 +0000)]
Avoid unnecessary copying of source string when generating a cloned TParser.
For long source strings the copying results in O(N^2) behavior, and the
multiplier can be significant if wide-char conversion is involved.
Tom Lane [Tue, 15 Dec 2009 17:57:48 +0000 (17:57 +0000)]
Support ORDER BY within aggregate function calls, at long last providing a
non-kluge method for controlling the order in which values are fed to an
aggregate function. At the same time eliminate the old implementation
restriction that DISTINCT was only supported for single-argument aggregates.
Possibly release-notable behavioral change: formerly, agg(DISTINCT x)
dropped null values of x unconditionally. Now, it does so only if the
agg transition function is strict; otherwise nulls are treated as DISTINCT
normally would, ie, you get one copy.
Robert Haas [Tue, 15 Dec 2009 04:57:48 +0000 (04:57 +0000)]
Add an EXPLAIN (BUFFERS) option to show buffer-usage statistics.
This patch also removes buffer-usage statistics from the track_counts
output, since this (or the global server statistics) is deemed to be a better
interface to this information.
Itagaki Takahiro, reviewed by Euler Taveira de Oliveira.
Tom Lane [Mon, 14 Dec 2009 02:15:54 +0000 (02:15 +0000)]
Fix a bug introduced when set-returning SQL functions were made inline-able:
we have to cope with the possibility that the declared result rowtype contains
dropped columns. This fails in 8.4, as per bug #5240.
While at it, be more paranoid about inserting binary coercions when inlining.
The pre-8.4 code did not really need to worry about that because it could not
inline at all in any case where an added coercion could change the behavior
of the function's statement. However, when inlining a SRF we allow sorting,
grouping, and set-ops such as UNION. In these cases, modifying one of the
targetlist entries that the sort/group/setop depends on could conceivably
change the behavior of the function's statement --- so don't inline when
such a case applies.
Itagaki Takahiro [Mon, 14 Dec 2009 00:39:11 +0000 (00:39 +0000)]
Additional fixes for large object access control.
Use pg_largeobject_metadata.oid instead of pg_largeobject.loid
to enumerate existing large objects in pg_dump, pg_restore, and
contrib modules.
Magnus Hagander [Sat, 12 Dec 2009 21:35:21 +0000 (21:35 +0000)]
Allow LDAP authentication to operate in search+bind mode, meaning it
does a search for the user in the directory first, and then binds with
the DN found for this user.
This allows for LDAP logins in scenarios where the DN of the user cannot
be determined simply by prefix and suffix, such as the case where different
users are located in different containers.
The old way of authentication can be significantly faster, so it's kept
as an option.
Tom Lane [Sat, 12 Dec 2009 19:24:35 +0000 (19:24 +0000)]
Fix integer-to-bit-string conversions to handle the first fractional byte
correctly when the output bit width is wider than the given integer by
something other than a multiple of 8 bits.
This has been wrong since I first wrote that code for 8.0 :-(. Kudos to
Roman Kononov for being the first to notice, though I didn't use his
patch. Per bug #5237.
Robert Haas [Sat, 12 Dec 2009 00:35:34 +0000 (00:35 +0000)]
Export ExplainBeginOutput() and ExplainEndOutput() for auto_explain.
Without these functions, anyone outside of explain.c can't actually use
ExplainPrintPlan, because the ExplainState won't be initialized properly.
The user-visible result of this was a crash when using auto_explain with
the JSON output format.
Report by Euler Taveira de Oliveira. Analysis by Tom Lane. Patch by me.
Tom Lane [Fri, 11 Dec 2009 21:50:06 +0000 (21:50 +0000)]
Arrange to generate different random sequences in the different child
processes of a pgbench run, when we are using -j > 1 and are emulating
threads via fork(). Otherwise the children all inherit the same random
sequence state and produce the same random-number sequence.
In the threaded case the different threads will share one RNG state, so
they will produce different subsets of one sequence, which is maybe more
correlated than a purist would like but will not be "the same". So we
leave that case alone.
First noticed by Takahiro Itagaki, and is also part of the explanation
for the pgbench misbehavior recently reported by Jaime Casanova.
Tom Lane [Fri, 11 Dec 2009 18:14:43 +0000 (18:14 +0000)]
Ensure that the result tuple of an EvalPlanQual cycle gets materialized
before we zap the input tuple. Otherwise, pass-by-reference columns of
the result slot are likely to contain just references to the input
tuple, leading to big trouble if the pfree'd space is reused. Per
trouble report from Jaime Casanova. This is a new bug in the recent
rewrite of EvalPlanQual, so nothing to back-patch.
Peter Eisentraut [Thu, 10 Dec 2009 06:32:28 +0000 (06:32 +0000)]
Add init[db] option to pg_ctl
pg_ctl gets a new mode that runs initdb. Adjust the documentation a bit to
not assume that initdb is the only way to run database cluster initialization.
But don't replace initdb as the canonical way.
Tom Lane [Wed, 9 Dec 2009 21:57:51 +0000 (21:57 +0000)]
Prevent indirect security attacks via changing session-local state within
an allegedly immutable index function. It was previously recognized that
we had to prevent such a function from executing SET/RESET ROLE/SESSION
AUTHORIZATION, or it could trivially obtain the privileges of the session
user. However, since there is in general no privilege checking for changes
of session-local state, it is also possible for such a function to change
settings in a way that might subvert later operations in the same session.
Examples include changing search_path to cause an unexpected function to
be called, or replacing an existing prepared statement with another one
that will execute a function of the attacker's choosing.
The present patch secures VACUUM, ANALYZE, and CREATE INDEX/REINDEX against
these threats, which are the same places previously deemed to need protection
against the SET ROLE issue. GUC changes are still allowed, since there are
many useful cases for that, but we prevent security problems by forcing a
rollback of any GUC change after completing the operation. Other cases are
handled by throwing an error if any change is attempted; these include temp
table creation, closing a cursor, and creating or deleting a prepared
statement. (In 7.4, the infrastructure to roll back GUC changes doesn't
exist, so we settle for rejecting changes of "search_path" in these contexts.)
Original report and patch by Gurjeet Singh, additional analysis by
Tom Lane.
Magnus Hagander [Wed, 9 Dec 2009 06:37:06 +0000 (06:37 +0000)]
Reject certificates with embedded NULLs in the commonName field. This stops
attacks where an attacker would put <attack>\0<propername> in the field and
trick the validation code that the certificate was for <attack>.
This is a very low risk attack since it reuqires the attacker to trick the
CA into issuing a certificate with an incorrect field, and the common
PostgreSQL deployments are with private CAs, and not external ones. Also,
default mode in 8.4 does not do any name validation, and is thus also not
vulnerable - but the higher security modes are.
Backpatch all the way. Even though versions 8.3.x and before didn't have
certificate name validation support, they still exposed this field for
the user to perform the validation in the application code, and there
is no way to detect this problem through that API.
Tom Lane [Wed, 9 Dec 2009 00:35:32 +0000 (00:35 +0000)]
Update time zone data files to tzdata release 2009s: DST law changes in
Antarctica, Argentina, Bangladesh, Fiji, Novokuznetsk, Pakistan, Palestine,
Samoa, Syria. Also historical corrections for Hong Kong.
Tom Lane [Mon, 7 Dec 2009 05:22:23 +0000 (05:22 +0000)]
Add exclusion constraints, which generalize the concept of uniqueness to
support any indexable commutative operator, not just equality. Two rows
violate the exclusion constraint if "row1.col OP row2.col" is TRUE for
each of the columns in the constraint.
Instead of expensive cross joins to resolve the ACL, add table-returning
function aclexplode() that expands the ACL into a useful form, and join
against that.
Also, implement the role_*_grants views as a thin layer over the respective
*_privileges views instead of essentially repeating the same code twice.
Fix bug in temporary file management with subtransactions. A cursor opened
in a subtransaction stays open even if the subtransaction is aborted, so
any temporary files related to it must stay alive as well. With the patch,
we use ResourceOwners to track open temporary files and don't automatically
close them at subtransaction end (though in the normal case temporary files
are registered with the subtransaction resource owner and will therefore be
closed).
At end of top transaction, we still check that there's no temporary files
marked as close-at-end-of-transaction open, but that's now just a debugging
cross-check as the resource owner cleanup should've closed them already.
Tom Lane [Wed, 2 Dec 2009 04:54:10 +0000 (04:54 +0000)]
Mark application_name as GUC_REPORT so that the value will be reported back
to the client by the server. This might seem pretty pointless but apparently
it will help pgbouncer, and perhaps other connection poolers. Anyway it's
practically free to do so for the normal use-case where appname is only set
in the startup packet --- we're just adding a few more bytes to the initial
ParameterStatus response packet. Per comments from Marko Kreen.
Tom Lane [Wed, 2 Dec 2009 04:38:35 +0000 (04:38 +0000)]
Instead of sending application_name as a SET command after the connection
is made, include it in the startup-packet options. This makes it work more
like every other libpq connection option, in particular it now has the same
response to RESET ALL as the rest. This also saves one network round trip
for new applications using application_name. The cost is that if the server
is pre-8.5, it'll reject the startup packet altogether, forcing us to retry
the entire connection cycle. But on balance we shouldn't be optimizing that
case in preference to the behavior with a new server, especially when doing
so creates visible behavioral oddities. Per discussion.
Tom Lane [Tue, 1 Dec 2009 21:00:24 +0000 (21:00 +0000)]
Teach the regular expression functions to do case-insensitive matching and
locale-dependent character classification properly when the database encoding
is UTF8.
The previous coding worked okay in single-byte encodings, or in any case for
ASCII characters, but failed entirely on multibyte characters. The fix
assumes that the <wctype.h> functions use Unicode code points as the wchar
representation for Unicode, ie, wchar matches pg_wchar.
This is only a partial solution, since we're still stupid about non-ASCII
characters in multibyte encodings other than UTF8. The practical effect
of that is limited, however, since those cases are generally Far Eastern
glyphs for which concepts like case-folding don't apply anyway. Certainly
all or nearly all of the field reports of problems have been about UTF8.
A more general solution would require switching to the platform's wchar
representation for all regex operations; which is possible but would have
substantial disadvantages. Let's try this and see if it's sufficient in
practice.
Peter Eisentraut [Mon, 30 Nov 2009 15:49:35 +0000 (15:49 +0000)]
In SRF example, move oldcontext variable definition into the FIRSTCALL
branch, which is how most actual code is actually structured. Also fix
slight whitespace misalignment.
Tom Lane [Sun, 29 Nov 2009 21:02:16 +0000 (21:02 +0000)]
Fix session-lifespan memory leak when a plperl function is redefined:
we have to tell Perl it can release its compiled copy of the function
text. Noted by Alexey Klyukin.
Back-patch to 8.2 --- the problem exists further back, but this patch
won't work without modification, and it's probably not worth the trouble.
Tom Lane [Sun, 29 Nov 2009 18:53:54 +0000 (18:53 +0000)]
Add some opr_sanity checks that the lengths of the various argument-info
arrays in a pg_proc entry match. Seems like an easy mistake to make when
manually adjusting these values in a pg_proc.h entry.
Tom Lane [Sun, 29 Nov 2009 18:14:32 +0000 (18:14 +0000)]
Make pg_stat_activity.application_name visible to all users, rather than
being hidden when current_query is. Relocate it to a column position
more consistent with that behavior. Per discussion.
Tom Lane [Sat, 28 Nov 2009 00:46:19 +0000 (00:46 +0000)]
Eliminate a lot of list-management overhead within join_search_one_level
by adding a requirement that build_join_rel add new join RelOptInfos to the
appropriate list immediately at creation. Per report from Robert Haas,
the list_concat_unique_ptr() calls that this change eliminates were taking
the lion's share of the runtime in larger join problems. This doesn't do
anything to fix the fundamental combinatorial explosion in large join
problems, but it should push out the threshold of pain a bit further.
Note: because this changes the order in which joinrel lists are built,
it might result in changes in selected plans in cases where different
alternatives have exactly the same costs. There is one example in the
regression tests.
Michael Meskes [Fri, 27 Nov 2009 10:00:40 +0000 (10:00 +0000)]
Added script to check if all rule re-definition in ecpg.addons are indeed used
in the build process. If not the build process will stop with an error message.
Tom Lane [Wed, 25 Nov 2009 20:26:31 +0000 (20:26 +0000)]
Simplify psql's new linestyle behavior to default to linestyle=ascii all
the time, rather than hoping we can tell whether the terminal supports
UTF8 characters. Per discussion.
Tom Lane [Mon, 23 Nov 2009 16:02:24 +0000 (16:02 +0000)]
Use diff's -w switch only on Windows, to avoid problems with inconsistent
newline representations. Per buildfarm results and subsequent discussion.
Sync up a couple of other places that had their own policies.
Fix an old bug in multixact and two-phase commit. Prepared transactions can
be part of multixacts, so allocate a slot for each prepared transaction in
the "oldest member" array in multixact.c. On PREPARE TRANSACTION, transfer
the oldest member value from the current backends slot to the prepared xact
slot. Also save and recover the value from the 2pc state file.
The symptom of the bug was that after a transaction prepared, a shared lock
still held by the prepared transaction was sometimes ignored by other
transactions.
Fix back to 8.1, where both 2PC and multixact were introduced.
Tom Lane [Sun, 22 Nov 2009 22:06:30 +0000 (22:06 +0000)]
Assorted wordsmithing on the documentation of \pset --- try to make it
a bit more consistent and less obviously written by different people at
different times.
Tom Lane [Sun, 22 Nov 2009 17:54:23 +0000 (17:54 +0000)]
Remove -w (--ignore-all-space) option from pg_regress's diff calls.
We have used -w for a long time as a means of reducing the reported diff
volume when one element of a result table isn't of the expected width.
However, most of the time the results just pass anyway, so this isn't as
important as it once was. Meanwhile, the risk of missing potentially
significant deviations has gone up, particularly with psql's ability to
report error cursor positions. So, let's switch over to space-sensitive
comparisons. Per my proposal of yesterday.
(All the expected files that I can test here seem to be ready for this
already, but we'll see what the buildfarm thinks about others.)
Tom Lane [Sun, 22 Nov 2009 05:20:41 +0000 (05:20 +0000)]
Improve psql's tabular display of wrapped-around data by inserting markers
in the formerly-always-blank columns just to left and right of the data.
Different marking is used for a line break caused by a newline in the data
than for a straight wraparound. A newline break is signaled by a "+" in the
right margin column in ASCII mode, or a carriage return arrow in UNICODE mode.
Wraparound is signaled by a dot in the right margin as well as the following
left margin in ASCII mode, or an ellipsis symbol in the same places in UNICODE
mode. "\pset linestyle old-ascii" is added to make the previous behavior
available if anyone really wants it.
In passing, this commit also cleans up a few regression test files that
had unintended spacing differences from the current actual output.
Roger Leigh, reviewed by Gabrielle Roth and other members of PDXPUG.
Tom Lane [Sat, 21 Nov 2009 05:44:05 +0000 (05:44 +0000)]
Refactor ecpg grammar so that it uses the core grammar's unreserved_keyword
list, minus a few specific words that have to be treated specially. This
replaces a hard-wired list of keywords that would have needed manual
maintenance, and was not getting it. The 8.4 coding was already missing
these words, causing ecpg to incorrectly treat them as reserved words:
CALLED, CATALOG, DEFINER, ENUM, FOLLOWING, INVOKER, OPTIONS, PARTITION,
PRECEDING, RANGE, SECURITY, SERVER, UNBOUNDED, WRAPPER. In HEAD we were
additionally missing COMMENTS, FUNCTIONS, SEQUENCES, TABLES.
Per gripe from Bosco Rama.
Tom Lane [Fri, 20 Nov 2009 20:38:12 +0000 (20:38 +0000)]
Add a WHEN clause to CREATE TRIGGER, allowing a boolean expression to be
checked to determine whether the trigger should be fired.
For BEFORE triggers this is mostly a matter of spec compliance; but for AFTER
triggers it can provide a noticeable performance improvement, since queuing of
a deferred trigger event and re-fetching of the row(s) at end of statement can
be short-circuited if the trigger does not need to be fired.
Tom Lane [Thu, 19 Nov 2009 02:45:33 +0000 (02:45 +0000)]
Fix memory leak in syslogger: logfile_rotate() would leak a copy of the
output filename if CSV logging was enabled and only one of the two possible
output files got rotated during a particular call (which would, in fact,
typically be the case during a size-based rotation). This would amount to
about MAXPGPATH (1KB) per rotation, and it's been there since the CSV
code was put in, so it's surprising that nobody noticed it before.
Per bug #5196 from Thomas Poindessous.
Tom Lane [Wed, 18 Nov 2009 21:57:56 +0000 (21:57 +0000)]
Add a hook to CREATE/ALTER ROLE to allow an external module to check the
strength of database passwords, and create a sample implementation of
such a hook as a new contrib module "passwordcheck".
Tom Lane [Mon, 16 Nov 2009 21:32:07 +0000 (21:32 +0000)]
Provide a parenthesized-options syntax for VACUUM, analogous to that recently
adopted for EXPLAIN. This will allow additional options to be implemented
in future without having to make them fully-reserved keywords. The old syntax
remains available for existing options, however.
Tom Lane [Mon, 16 Nov 2009 18:04:40 +0000 (18:04 +0000)]
While doing the final setrefs.c pass over a plan tree, try to match up
non-Var sort/group expressions using ressortgroupref labels instead of
depending entirely on equal()-ity of the upper node's tlist expressions
to the lower node's. This avoids emitting the wrong outputs in cases
where there are textually identical volatile sort/group expressions,
as for example
select distinct random(),random() from generate_series(1,10);
Per report from Andrew Gierth.
Backpatch to 8.4. Arguably this is wrong all the way back, but the only known
case where there's an observable problem is when using hash aggregation to
implement DISTINCT, which is new as of 8.4. So for the moment I'll refrain
from backpatching further.
Tom Lane [Sun, 15 Nov 2009 02:45:35 +0000 (02:45 +0000)]
Improve planning of Materialize nodes inserted atop the inner input of a
mergejoin to shield it from doing mark/restore and refetches. Put an explicit
flag in MergePath so we can centralize the logic that knows about this,
and add costing logic that considers using Materialize even when it's not
forced by the previously-existing considerations. This is in response to
a discussion back in August that suggested that materializing an inner
indexscan can be helpful when the refetch percentage is high enough.