Andrew Dunstan [Sat, 17 Mar 2012 21:24:14 +0000 (17:24 -0400)]
Honor inputdir and outputdir when converting regression files.
When converting source files, pg_regress' inputdir and outputdir options were
ignored when computing the locations of the destination files. In consequence,
these options were effectively unusable when the regression inputs need to
be adjusted by pg_regress. This patch makes pg_regress put the converted files
in the same place that these options specify non-converted input or results
files are to be found. Backpatched to all live branches.
Tom Lane [Mon, 5 Mar 2012 19:09:10 +0000 (14:09 -0500)]
Improve documentation around logging_collector and use of stderr.
In backup.sgml, point out that you need to be using the logging collector
if you want to log messages from a failing archive_command script. (This
is an oversimplification, in that it will work without the collector as
long as you're not sending postmaster stderr to /dev/null; but it seems
like a good idea to encourage use of the collector to avoid problems
with multiple processes concurrently scribbling on one file.)
In config.sgml, do some wordsmithing of logging_collector discussion.
Tom Lane [Thu, 23 Feb 2012 20:53:34 +0000 (15:53 -0500)]
Convert newlines to spaces in names written in pg_dump comments.
pg_dump was incautious about sanitizing object names that are emitted
within SQL comments in its output script. A name containing a newline
would at least render the script syntactically incorrect. Maliciously
crafted object names could present a SQL injection risk when the script
is reloaded.
Reported by Heikki Linnakangas, patch by Robert Haas
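As an illustration only (pg_dump itself is written in C, and this is not its code), the sanitizing step might be sketched in Python as below. The helper names and the "-- Name:" header layout are hypothetical, and treating carriage returns the same as newlines is an assumption.

```python
def sanitize_comment_name(name: str) -> str:
    """Replace line breaks so the name cannot terminate a '--' comment early."""
    return name.replace("\r", " ").replace("\n", " ")


def comment_line(name: str) -> str:
    # Hypothetical header layout; the real dump script format may differ.
    return "-- Name: " + sanitize_comment_name(name)


# A name crafted to smuggle a command onto its own line of the script.
hostile = "t\nDROP TABLE audit;"
line = comment_line(hostile)
```

With the line break replaced, the whole name stays inside the single-line comment.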
Tom Lane [Thu, 23 Feb 2012 20:39:20 +0000 (15:39 -0500)]
Require execute permission on the trigger function for CREATE TRIGGER.
This check was overlooked when we added function execute permissions to the
system years ago. For an ordinary trigger function it's not a big deal,
since trigger functions execute with the permissions of the table owner,
so they couldn't do anything the user issuing the CREATE TRIGGER couldn't
have done anyway. However, if a trigger function is SECURITY DEFINER,
that is not the case. The lack of checking would allow another user to
install it on his own table and then invoke it with, essentially, forged
input data; which the trigger function is unlikely to realize, so it might
do something undesirable, for instance insert false entries in an audit log
table.
Tom Lane [Tue, 21 Feb 2012 20:04:01 +0000 (15:04 -0500)]
Don't clear btpo_cycleid during _bt_vacuum_one_page.
When "vacuuming" a single btree page by removing LP_DEAD tuples, we are not
actually within a vacuum operation, but rather in an ordinary insertion
process that could well be running concurrently with a vacuum. So clearing
the cycleid is incorrect, and could cause the concurrent vacuum to miss
removing tuples that it needs to remove. This is a longstanding bug
introduced by commit e6284649b9e30372b3990107a082bc7520325676 of
2006-07-25. I believe it explains Maxim Boguk's recent report of index
corruption, and probably some other previously unexplained reports.
In 9.0 and up this is a one-line fix; before that we need to introduce a
flag to tell _bt_delitems what to do.
Tom Lane [Mon, 20 Feb 2012 05:52:59 +0000 (00:52 -0500)]
Fix regex back-references that are directly quantified with *.
The syntax "\n*", that is a backref with a * quantifier directly applied
to it, has never worked correctly in Spencer's library. This has been an
open bug in the Tcl bug tracker since 2005:
https://sourceforge.net/tracker/index.php?func=detail&aid=1115587&group_id=10894&atid=110894
The core of the problem is in parseqatom(), which first changes "\n*" to
"\n+|" and then applies repeat() to the NFA representing the backref atom.
repeat() thinks that any arc leading into its "rp" argument is part of the
sub-NFA to be repeated. Unfortunately, since parseqatom() already created
the arc that was intended to represent the empty bypass around "\n+", this
arc gets moved too, so that it now leads into the state loop created by
repeat(). Thus, what was supposed to be an "empty" bypass gets turned into
something that represents zero or more repetitions of the NFA representing
the backref atom. In the original example, in place of
^([bc])\1*$
we now have something that acts like
^([bc])(\1+|[bc]*)$
At runtime, the branch involving the actual backref fails, as it's supposed
to, but then the other branch succeeds anyway.
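The difference between the intended and the miscompiled pattern can be checked with any engine that handles this case. The sketch below uses Python's re module purely as a reference point; it is a different implementation from Spencer's library and says nothing about how the fix itself works.

```python
import re

# The pattern as written by the user.
intended = re.compile(r"^([bc])\1*$")
# What the miscompiled pattern behaved like, per the description above.
miscompiled = re.compile(r"^([bc])(\1+|[bc]*)$")

samples = ["b", "bbb", "bc", "bcb"]
intended_hits = [s for s in samples if intended.match(s)]
miscompiled_hits = [s for s in samples if miscompiled.match(s)]
```

Strings that mix "b" and "c" are accepted only by the second pattern, which is the wrong answer described above.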
We could no doubt fix this by some rearrangement of the operations in
parseqatom(), but that code is plenty ugly already, and what's more the
whole business of converting "x*" to "x+|" probably needs to go away to fix
another problem I'll mention in a moment. Instead, this patch suppresses
the *-conversion when the target is a simple backref atom, leaving the case
of m == 0 to be handled at runtime. This makes the patch in regcomp.c a
one-liner, at the cost of having to tweak cbrdissect() a little. In the
event I went a bit further than that and rewrote cbrdissect() to check all
the string-length-related conditions before it starts comparing characters.
It seems a bit stupid to possibly iterate through many copies of an
n-character backreference, only to fail at the end because the target
string's length isn't a multiple of n --- we could have found that out
before starting. The existing coding could only be a win if integer
division is hugely expensive compared to character comparison, but I don't
know of any modern machine where that might be true.
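The check-lengths-first ordering can be sketched as follows. This is not cbrdissect() itself: match_repeated_backref and its parameters are hypothetical, with ref standing for the captured text and min_reps/max_reps for the quantifier bounds (max_reps=None meaning unbounded).

```python
def match_repeated_backref(ref, target, min_reps, max_reps=None):
    """Return True if target is ref repeated between min_reps and max_reps times."""
    n = len(ref)
    if n == 0:
        # An empty backref can only match an empty target.
        return len(target) == 0
    if len(target) == 0:
        # An empty target matches only if zero repetitions are allowed.
        return min_reps == 0
    # Do all the length arithmetic before comparing any characters.
    reps, leftover = divmod(len(target), n)
    if leftover != 0:
        return False
    if reps < min_reps or (max_reps is not None and reps > max_reps):
        return False
    # Only now compare each n-character slice against the reference.
    return all(target[i:i + n] == ref for i in range(0, len(target), n))
```

A target whose length is not a multiple of the backref length is rejected after one division, without touching any characters.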
This does not fix all the problems with quantified back-references. In
particular, the code is still broken for back-references that appear within
a larger expression that is quantified (so that direct insertion of the
quantification limits into the BACKREF node doesn't apply). I think fixing
that will take some major surgery on the NFA code, specifically introducing
an explicit iteration node type instead of trying to transform iteration
into concatenation of modified regexps.
Back-patch to all supported branches. In HEAD, also add a regression test
case for this. (It may seem a bit silly to create a regression test file
for just one test case; but I'm expecting that we will soon import a whole
bunch of regex regression tests from Tcl, so might as well create the
infrastructure now.)
Tom Lane [Fri, 17 Feb 2012 01:00:34 +0000 (20:00 -0500)]
Fix longstanding error in contrib/intarray's int[] & int[] operator.
The array intersection code would give wrong results if the first entry of
the correct output array would be "1". (I think only this value could be
at risk, since the previous word would always be a lower-bound entry with
that fixed value.)
Problem spotted by Julien Rouhaud, initial patch by Guillaume Lelarge,
cosmetic improvements by me.
Tom Lane [Sat, 11 Feb 2012 23:06:46 +0000 (18:06 -0500)]
Fix I/O-conversion-related memory leaks in plpgsql.
Datatype I/O functions are allowed to leak memory in CurrentMemoryContext,
since they are generally called in short-lived contexts. However, plpgsql
calls such functions for purposes of type conversion, and was calling them
in its procedure context. Therefore, any leaked memory would not be
recovered until the end of the plpgsql function. If such a conversion
was done within a loop, quite a bit of memory could get consumed. Fix by
calling such functions in the transient "eval_econtext", and adjust other
logic to match. Back-patch to all supported versions.
Tom Lane [Fri, 10 Feb 2012 19:09:42 +0000 (14:09 -0500)]
Fix brain fade in previous pg_dump patch.
In pre-7.3 databases, pg_attribute.attislocal doesn't exist. The easiest
way to make sure the new inheritance logic behaves sanely is to assume it's
TRUE, not FALSE. This will result in printing child columns even when
they're not really needed. We could work harder at trying to reconstruct a
value for attislocal, but there is little evidence that anyone still cares
about dumping from such old versions, so just do the minimum necessary to
have a valid dump.
I had this correct in the original draft of the patch, but for some
unaccountable reason decided it wasn't necessary to change the value.
Testing against an old server shows otherwise...
Tom Lane [Fri, 10 Feb 2012 18:28:31 +0000 (13:28 -0500)]
Fix pg_dump for better handling of inherited columns.
Revise pg_dump's handling of inherited columns, which was last looked at
seriously in 2001, to eliminate several misbehaviors associated with
inherited default expressions and NOT NULL flags. In particular make sure
that a column is printed in a child table's CREATE TABLE command if and
only if it has attislocal = true; the former behavior would sometimes cause
a column to become marked attislocal when it was not so marked in the
source database. Also, stop relying on textual comparison of default
expressions to decide if they're inherited; instead, don't use
default-expression inheritance at all, but just install the default
explicitly at each level of the hierarchy. This fixes the
search-path-related misbehavior recently exhibited by Chester Young, and
also removes some dubious assumptions about the order in which ALTER TABLE
SET DEFAULT commands would be executed.
Tom Lane [Mon, 6 Feb 2012 18:15:04 +0000 (13:15 -0500)]
Avoid problems with OID wraparound during WAL replay.
Fix a longstanding thinko in replay of NEXTOID and checkpoint records: we
tried to advance nextOid only if it was behind the value in the WAL record,
but the comparison would draw the wrong conclusion if OID wraparound had
occurred since the previous value. Better to just unconditionally assign
the new value, since OID assignment shouldn't be happening during replay
anyway.
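A minimal sketch of why the conditional update goes wrong, assuming OIDs behave as unsigned 32-bit counters. The function names and the sample values are hypothetical and are not taken from the replay code.

```python
OID_MODULUS = 2 ** 32  # assumption: OIDs wrap like unsigned 32-bit counters


def replay_conditional(next_oid, record_oid):
    """Old logic: advance only if the local counter looks behind the record."""
    return record_oid if next_oid < record_oid else next_oid


def replay_unconditional(next_oid, record_oid):
    """Fixed logic: always take the value from the WAL record."""
    return record_oid


# The counter wrapped around between two WAL records.
before_wrap = OID_MODULUS - 10
after_wrap = 20000  # hypothetical post-wraparound value

stale = replay_conditional(before_wrap, after_wrap)
fresh = replay_unconditional(before_wrap, after_wrap)
```

After the wrap, the plain comparison concludes that the local counter is already ahead, so the old logic keeps the pre-wraparound value.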
The consequences of a failure to update nextOid would be pretty minimal,
since we have long had the code set up to obtain another OID and try again
if the generated value is already in use. But in the worst case there
could be significant performance glitches while such loops iterate through
many already-used OIDs before finding a free one.
The odds of a wraparound happening during WAL replay would be small in a
crash-recovery scenario, and the length of any ensuing OID-assignment stall
quite limited anyway. But neither of these statements holds true for a
replication slave that follows a WAL stream for a long period; its behavior
upon going live could be almost unboundedly bad. Hence it seems worth
back-patching this fix into all supported branches.
Accept a non-existent value in "ALTER USER/DATABASE SET ..." command.
When the default_text_search_config, default_tablespace, or temp_tablespaces
setting is set per-user or per-database, with an "ALTER USER/DATABASE SET
..." statement, don't throw an error if the text search configuration or
tablespace does not exist. In case of text search configuration, even if
it doesn't exist in the current database, it might exist in another
database, where the setting is intended to have its effect. This behavior
is now the same as search_path's.
Tablespaces are cluster-wide, so the same argument doesn't hold for
tablespaces, but there's a problem with pg_dumpall: it dumps "ALTER USER
SET ..." statements before the "CREATE TABLESPACE" statements. Arguably
that's pg_dumpall's fault - it should dump the statements in such an order
that the tablespace is created first and then the "ALTER USER SET
default_tablespace ..." statements after that - but it seems better to be
consistent with search_path and default_text_search_config anyway. Besides,
you could still create a dump that throws an error, by creating the
tablespace, running "ALTER USER SET default_tablespace", then dropping the
tablespace and running pg_dumpall on that.
Tom Lane [Sat, 28 Jan 2012 04:09:16 +0000 (23:09 -0500)]
Fix error detection in contrib/pgcrypto's encrypt_iv() and decrypt_iv().
Due to oversights, the encrypt_iv() and decrypt_iv() functions failed to
report certain types of invalid-input errors, and would instead return
random garbage values.
Tom Lane [Tue, 10 Jan 2012 00:56:27 +0000 (19:56 -0500)]
Fix one-byte buffer overrun in contrib/test_parser.
The original coding examined the next character before verifying that
there *is* a next character. In the worst case with the input buffer
right up against the end of memory, this would result in a segfault.
Problem spotted by Paul Guyot; this commit extends his patch to fix an
additional case. In addition, make the code a tad more readable by not
overloading the usage of *tlen.
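The ordering issue can be sketched in Python as below. In Python an out-of-range index raises an exception instead of reading stray memory, so this shows only the order of the tests, not the crash. The blank/non-blank token classes and the token_length helper are assumptions made for the illustration.

```python
def token_length(buf: bytes, start: int) -> int:
    """Length of the run of same-class bytes (blank vs non-blank) at start."""
    if start >= len(buf):
        return 0
    is_blank = buf[start:start + 1] == b" "
    end = start + 1
    # Check that a next byte exists *before* looking at it.
    while end < len(buf) and (buf[end:end + 1] == b" ") == is_blank:
        end += 1
    return end - start
```

The bounds test comes first in the loop condition, so the lookahead never runs once the end of the buffer has been reached.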
Tom Lane [Sat, 7 Jan 2012 20:39:16 +0000 (15:39 -0500)]
Use __sync_lock_test_and_set() for spinlocks on ARM, if available.
Historically we've used the SWPB instruction for TAS() on ARM, but this
is deprecated and not available on ARMv6 and later. Instead, make use
of a GCC builtin if available. We'll still fall back to SWPB if not,
so as not to break existing ports using older GCC versions.
Eventually we might want to try using __sync_lock_test_and_set() on some
other architectures too, but for now that seems to present only risk and
not reward.
Back-patch to all supported versions, since people might want to use any
of them on more recent ARM chips.
Tom Lane [Fri, 6 Jan 2012 18:04:37 +0000 (13:04 -0500)]
Fix pg_restore's direct-to-database mode for INSERT-style table data.
In commit 6545a901aaf84cb05212bb6a7674059908f527c3, I removed the mini SQL
lexer that was in pg_backup_db.c, thinking that it had no real purpose
beyond separating COPY data from SQL commands, which purpose had been
obsoleted by long-ago fixes in pg_dump's archive file format.
Unfortunately this was in error: that code was also used to identify
command boundaries in INSERT-style table data, which is run together as a
single string in the archive file for better compressibility. As a result,
direct-to-database restores from archive files made with --inserts or
--column-inserts fail in our latest releases, as reported by Dick Visser.
To fix, restore the mini SQL lexer, but simplify it by adjusting the
calling logic so that it's only required to cope with INSERT-style table
data, not arbitrary SQL commands. This allows us to not have to deal with
SQL comments, E'' strings, or dollar-quoted strings, none of which have
ever been emitted by dumpTableData_insert.
Also, fix the lexer to cope with standard-conforming strings, which was the
actual bug that the previous patch was meant to solve.
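A rough sketch of what such a restricted lexer has to do, written in Python rather than C and not taken from pg_backup_db.c. It assumes commands end at a semicolon followed by a newline, and that only single-quoted literals and double-quoted identifiers need tracking. The function name and the std_strings flag are hypothetical.

```python
def split_insert_statements(data: str, std_strings: bool = True):
    """Split run-together INSERT commands at ';' + newline outside quotes."""
    statements = []
    start = 0
    quote = None  # None, "'" or '"'
    i = 0
    n = len(data)
    while i < n:
        ch = data[i]
        if quote is None:
            if ch in ("'", '"'):
                quote = ch
            elif ch == ";" and data[i + 1:i + 2] == "\n":
                statements.append(data[start:i + 1])
                start = i + 2
                i += 1  # also skip the newline
        elif ch == "\\" and quote == "'" and not std_strings:
            i += 1  # backslash escapes the next character
        elif ch == quote:
            quote = None  # a doubled quote simply closes and reopens
        i += 1
    if data[start:].strip():
        statements.append(data[start:])  # unterminated remainder
    return statements


data = (
    "INSERT INTO t VALUES ('a;\nb');\n"
    "INSERT INTO t VALUES ('c\\');\n"
    "INSERT INTO t VALUES ('d');\n"
)
```

With std_strings=True the sample splits into three commands. With std_strings=False the backslash swallows the closing quote and the last two run together, which is the kind of mis-split that ignoring standard-conforming strings produces.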
Back-patch to all supported branches. The previous patch went back to 8.2,
which unfortunately means that the EOL release of 8.2 contains this bug,
but I don't think we're doing another 8.2 release just because of that.
Revert the behavior of inet/cidr functions to not unpack the arguments.
I forgot to change the functions to use the PG_GETARG_INET_PP() macro,
when I changed DatumGetInetP() to unpack the datum, like Datum*P macros
usually do. Also, I screwed up the definition of the PG_GETARG_INET_PP()
macro, and didn't notice because it wasn't used.
This fixes the memory leak when sorting inet values, as reported
by Jochen Erwied and debugged by Andres Freund. Backpatch to 8.3, like
the previous patch that broke it.
Tom Lane [Wed, 30 Nov 2011 05:37:33 +0000 (00:37 -0500)]
Tweak previous patch to ensure edata->filename always gets initialized.
On a platform that isn't supplying __FILE__, previous coding would either
crash or give a stale result for the filename string. Not sure how likely
that is, but the original code catered for it, so let's keep doing so.
Peter Eisentraut [Tue, 29 Nov 2011 20:04:59 +0000 (22:04 +0200)]
Strip file names reported in error messages in vpath builds.
In vpath builds, the __FILE__ macro that is used in verbose error
reports contains the full absolute file name, which makes the error
messages excessively verbose. So keep only the base name, thus
matching the behavior of non-vpath builds.
Tom Lane [Sat, 19 Nov 2011 05:35:29 +0000 (00:35 -0500)]
Avoid floating-point underflow while tracking buffer allocation rate.
When the system is idle for a while after activity, the "smoothed_alloc"
state variable in BgBufferSync converges slowly to zero. With standard
IEEE float arithmetic this results in several iterations with denormalized
values, which causes kernel traps and annoying log messages on some
poorly-designed platforms. There's no real need to track such small values
of smoothed_alloc, so we can prevent the kernel traps by forcing it to zero
as soon as it's too small to be interesting for our purposes. This issue
is purely cosmetic, since the iterations don't happen fast enough for the
kernel traps to pose any meaningful performance problem, but still it seems
worth shutting up the log messages.
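The effect of the clamp can be sketched as below. This is a Python approximation that uses doubles rather than C floats. The moving-average rule, the sample count, the multiplier, and the "estimate rounds to zero" cutoff are all assumptions made for the illustration, not a transcription of BgBufferSync.

```python
import sys


def step(smoothed, recent_alloc, samples=16.0, multiplier=2.0, clamp=True):
    """One update of the smoothed allocation rate (hypothetical rule)."""
    if smoothed <= recent_alloc:
        smoothed = float(recent_alloc)
    else:
        smoothed += (recent_alloc - smoothed) / samples
    # Once the derived estimate rounds to zero, stop tracking tiny values.
    if clamp and int(smoothed * multiplier) == 0:
        smoothed = 0.0
    return smoothed


def count_denormal_steps(clamp, start=100.0, max_steps=20000):
    """Count idle iterations that leave smoothed at a subnormal value."""
    smoothed = start
    denormal = 0
    for _ in range(max_steps):
        smoothed = step(smoothed, 0, clamp=clamp)
        if 0.0 < smoothed < sys.float_info.min:
            denormal += 1
        if smoothed == 0.0:
            break
    return denormal
```

Under these assumptions the clamped version reaches exactly zero without ever holding a subnormal value, while the unclamped version spends many iterations in the subnormal range.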
The kernel log messages were previously reported by a number of people,
but kudos to Greg Matthews for tracking down exactly where they were coming
from.
Make DatumGetInetP() unpack inet datums with a 1-byte header, and add
a new macro, DatumGetInetPP(), that does not. This brings these macros
in line with other DatumGet*P() macros.
Backpatch to 8.3, where 1-byte header varlenas were introduced.
Tom Lane [Sat, 5 Nov 2011 03:23:33 +0000 (23:23 -0400)]
Don't assume that a tuple's header size is unchanged during toasting.
This assumption can be wrong when the toaster is passed a raw on-disk
tuple, because the tuple might pre-date an ALTER TABLE ADD COLUMN operation
that added columns without rewriting the table. In such a case the tuple's
natts value is smaller than what we expect from the tuple descriptor, and
so its t_hoff value could be smaller too. In fact, the tuple might not
have a null bitmap at all, and yet our current opinion of it is that it
contains some trailing nulls.
In such a situation, toast_insert_or_update did the wrong thing, because
to save a few lines of code it would use the old t_hoff value as the offset
where heap_fill_tuple should start filling data. This did not leave enough
room for the new nulls bitmap, with the result that the first few bytes of
data could be overwritten with null flag bits, as in a recent report from
Hubert Depesz Lubaczewski.
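To see how the header offset can differ between the old tuple and the rebuilt one, here is a small arithmetic sketch. The 23-byte fixed header and 8-byte alignment are assumptions about the tuple layout on a typical platform, and header_size is a hypothetical helper, not a function from the source tree.

```python
HEADER_FIXED_BYTES = 23  # assumed size of the fixed header before the bitmap
MAX_ALIGNMENT = 8        # assumed platform alignment


def header_size(natts: int, has_nulls: bool) -> int:
    """Aligned header size for a tuple with natts attributes."""
    size = HEADER_FIXED_BYTES
    if has_nulls:
        size += (natts + 7) // 8  # one bit per attribute, rounded up to bytes
    # Round up to the next multiple of the alignment.
    return (size + MAX_ALIGNMENT - 1) // MAX_ALIGNMENT * MAX_ALIGNMENT


# On-disk tuple written before ADD COLUMN: 8 attributes, no null bitmap.
old_hoff = header_size(8, has_nulls=False)
# Same row seen through the current descriptor: 9 attributes, trailing null.
new_hoff = header_size(9, has_nulls=True)
```

Under these assumptions the rebuilt header needs more room than the old offset provides, so filling data at the old offset leaves no space for the bitmap.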
The particular case reported requires ALTER TABLE ADD COLUMN followed by
CREATE TABLE AS SELECT * FROM ... or INSERT ... SELECT * FROM ..., and
further requires that there be some out-of-line toasted fields in one of
the tuples to be copied; else we'll not reach the troublesome code.
The problem can only manifest in this form in 8.4 and later, because
before commit a77eaa6a95009a3441e0d475d1980259d45da072, CREATE TABLE AS or
INSERT/SELECT wouldn't result in raw disk tuples getting passed directly
to heap_insert --- there would always have been at least a junkfilter in
between, and that would reconstitute the tuple header with an up-to-date
t_natts and hence t_hoff. But I'm backpatching the tuptoaster change all
the way anyway, because I'm not convinced there are no older code paths
that present a similar risk.
Tom Lane [Thu, 3 Nov 2011 23:18:10 +0000 (19:18 -0400)]
Fix bogus code in contrib/ tsearch dictionary examples.
Both dict_int and dict_xsyn were blithely assuming that whatever memory
palloc gives back will be pre-zeroed. This would typically work for
just about long enough to run their regression tests, and no longer :-(.
The pre-9.0 code in dict_xsyn was even lamer than that, as it would
happily give back a pointer to the result of palloc(0), encouraging
its caller to access off the end of memory. Again, this would just
barely fail to fail as long as memory contained nothing but zeroes.
Per a report from Rodrigo Hjort that code based on these examples
didn't work reliably.
Tom Lane [Wed, 2 Nov 2011 17:38:21 +0000 (13:38 -0400)]
Revert "Stop btree indexscans upon reaching nulls in either direction."
This reverts commit ff41611ddcce36b8f87b73c65b78d8f71a157302.
As pointed out by Naoya Anzai, we need to do more work to make that
idea handle end-of-index cases, and it is looking like too much risk
for a back-patch. So bug #6278 is only going to be fixed in HEAD.
Tom Lane [Tue, 1 Nov 2011 23:49:01 +0000 (19:49 -0400)]
Fix race condition with toast table access from a stale syscache entry.
If a tuple in a syscache contains an out-of-line toasted field, and we
try to fetch that field shortly after some other transaction has committed
an update or deletion of the tuple, there is a race condition: vacuum
could come along and remove the toast tuples before we can fetch them.
This leads to transient failures like "missing chunk number 0 for toast
value NNNNN in pg_toast_2619", as seen in recent reports from Andrew
Hammond and Tim Uckun.
The design idea of syscache is that access to stale syscache entries
should be prevented by relation-level locks, but that fails for at least
two cases where toasted fields are possible: ANALYZE updates pg_statistic
rows without locking out sessions that might want to plan queries on the
same table, and CREATE OR REPLACE FUNCTION updates pg_proc rows without
any meaningful lock at all.
The least risky fix seems to be an idea that Heikki suggested when we
were dealing with a related problem back in August: forcibly detoast any
out-of-line fields before putting a tuple into syscache in the first place.
This avoids the problem because at the time we fetch the parent tuple from
the catalog, we should be holding an MVCC snapshot that will prevent
removal of the toast tuples, even if the parent tuple is outdated
immediately after we fetch it. (Note: I'm not convinced that this
statement holds true at every instant where we could be fetching a syscache
entry at all, but it does appear to hold true at the times where we could
fetch an entry that could have a toasted field. We will need to be a bit
wary of adding toast tables to low-level catalogs that don't have them
already.) An additional benefit is that subsequent uses of the syscache
entry should be faster, since they won't have to detoast the field.
Back-patch to all supported versions. The problem is significantly harder
to reproduce in pre-9.0 releases, because of their willingness to flush
every entry in a syscache whenever the underlying catalog is vacuumed
(cf CatalogCacheFlushRelation); but there is still a window for trouble.
Tom Lane [Mon, 31 Oct 2011 20:40:27 +0000 (16:40 -0400)]
Stop btree indexscans upon reaching nulls in either direction.
The existing scan-direction-sensitive tests were overly complex, and
failed to stop the scan in cases where it's perfectly legitimate to do so.
Per bug #6278 from Maksym Boguk.
Back-patch to 8.3, which is as far back as the patch applies easily.
Doesn't seem worth sweating over a relatively minor performance issue in
8.2 at this late date. (But note that this was a performance regression
from 8.1 and before, so 8.2 is being left as an outlier.)
Tom Lane [Sat, 29 Oct 2011 18:31:12 +0000 (14:31 -0400)]
Fix assorted bogosities in cash_in() and cash_out().
cash_out failed to handle multiple-byte thousands separators, as per bug
#6277 from Alexander Law. In addition, cash_in didn't handle that either,
nor could it handle multiple-byte positive_sign. Both routines failed to
support multiple-byte mon_decimal_point, which I did not think was worth
changing, but at least now they check for the possibility and fall back to
using '.' rather than emitting invalid output. Also, make cash_in handle
trailing negative signs, which formerly it would reject. Since cash_out
generates trailing negative signs whenever the locale tells it to, this
last omission represents a fail-to-reload-dumped-data bug. IMO that
justifies patching this all the way back.
Tom Lane [Wed, 26 Oct 2011 17:02:53 +0000 (13:02 -0400)]
Change FK trigger creation order to better support self-referential FKs.
When a foreign-key constraint references another column of the same table,
row updates will queue both the PK's ON UPDATE action and the FK's CHECK
action in the same event. The ON UPDATE action must execute first, else
the CHECK will check a non-final state of the row and possibly throw an
inappropriate error, as seen in bug #6268 from Roman Lytovchenko.
Now, the firing order of multiple triggers for the same event is determined
by the sort order of their pg_trigger.tgnames, and the auto-generated names
we use for FK triggers are "RI_ConstraintTrigger_NNNN" where NNNN is the
trigger OID. So most of the time the firing order is the same as creation
order, and so rearranging the creation order fixes it.
This patch will fail to fix the problem if the OID counter wraps around or
adds a decimal digit (eg, from 99999 to 100000) while we are creating the
triggers for an FK constraint. Given the small odds of that, and the low
usage of self-referential FKs, we'll live with that solution in the back
branches. A better fix is to change the auto-generated names for FK
triggers, but it seems unwise to do that in stable branches because there
may be client code that depends on the naming convention. We'll fix it
that way in HEAD in a separate patch.
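The digit-rollover case is easy to reproduce with a plain string sort. The sketch below assumes trigger names are compared as ordinary byte strings; firing_order is a hypothetical helper and the OIDs are made up.

```python
def firing_order(trigger_oids):
    """Order auto-generated FK trigger names by plain string comparison."""
    names = ["RI_ConstraintTrigger_%d" % oid for oid in trigger_oids]
    return sorted(names)


# Usual case: name order matches creation order.
usual = firing_order([16401, 16402])
# OID counter gains a decimal digit between the two triggers.
rollover = firing_order([99999, 100000])
```

In the rollover case the later-created trigger sorts first, because "1" compares lower than "9" at the first differing character.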
Back-patch to all supported branches, since this bug has existed for a long
time.
Back-patch to 8.3. The 8.2 code is rather different in this area, and it
doesn't seem worth any risk to fix a corner case that nobody has stumbled
on before.
Tom Lane [Sat, 15 Oct 2011 00:24:50 +0000 (20:24 -0400)]
Fix bugs in information_schema.referential_constraints view.
This view was being insufficiently careful about matching the FK constraint
to the depended-on primary or unique key constraint. That could result in
failure to show an FK constraint at all, or showing it multiple times, or
claiming that it depended on a different constraint than the one it really
does. Fix by joining via pg_depend to ensure that we find only the correct
dependency.
Back-patch, but don't bump catversion because we can't force initdb in back
branches. The next minor-version release notes should explain that if you
need to fix this in an existing installation, you can drop the
information_schema schema then re-create it by sourcing
$SHAREDIR/information_schema.sql in each database (as a superuser of
course).
Tom Lane [Wed, 12 Oct 2011 17:59:30 +0000 (13:59 -0400)]
Improve documentation of psql's \q command.
The documentation neglected to explain its behavior in a script file
(it only ends execution of the script, not psql as a whole), and failed
to mention the long form \quit either.
Don't let transform_null_equals=on affect CASE foo WHEN NULL ... constructs.
transform_null_equals is only supposed to affect "foo = NULL" expressions
given directly by the user, not the internal "foo = NULL" expression
generated from CASE-WHEN.
This fixes bug #6242, reported by Sergey. Backpatch to all supported
branches.
Robert Haas [Thu, 6 Oct 2011 16:08:59 +0000 (12:08 -0400)]
Make pgstatindex respond to cancel interrupts.
A similar problem for pgstattuple() was fixed in April of 2010 by commit
33065ef8bc52253ae855bc959576e52d8a28ba06, but pgstatindex() seems to have
been overlooked.
Back-patch all the way, as with that commit, though not to 7.4 through
8.1, since those are now EOL.
Tom Lane [Sat, 24 Sep 2011 02:12:36 +0000 (22:12 -0400)]
Fix our mapping of Windows timezones for Central America.
We were mapping "Central America Standard Time" to "CST6CDT", which seems
entirely wrong, because according to the Olson timezone database no place
in Central America observes daylight saving time on any regular basis ---
and certainly not according to the USA DST rules that are implied by
"CST6CDT". (Mexico is an exception, but they can be disregarded since
they have a separate timezone name in Windows.) So, map this zone name to
plain "CST6", which will provide a fixed UTC offset.
As written, this patch will also result in mapping "Central America
Daylight Time" to CST6. I considered hacking things so that would still
map to CST6CDT, but it seems it would confuse win32tzlist.pl to put those
two names in separate entries. Since there's little evidence that any
such zone name is used in the wild, much less that CST6CDT would be a good
match for it, I'm not too worried about what we do with it.
Tom Lane [Fri, 16 Sep 2011 08:28:11 +0000 (04:28 -0400)]
gistendscan() forgot to free so->giststate.
This oversight led to a massive memory leak --- upwards of 10KB per tuple
--- during creation-time verification of an exclusion constraint based on a
GIST index. In most other scenarios it'd just be a leak of 10KB that would
be recovered at end of query, so not too significant; though perhaps the
leak would be noticeable in a situation where a GIST index was being used
in a nestloop inner indexscan. In any case, it's a real leak of long
standing, so patch all supported branches. Per report from Harald Fuchs.
Tom Lane [Wed, 7 Sep 2011 21:06:39 +0000 (17:06 -0400)]
Fix corner case bug in numeric to_char().
Trailing-zero stripping applied by the FM specifier could strip zeroes
to the left of the decimal point, for a format with no digit positions
after the decimal point (such as "FM999.").
Reported and diagnosed by Marti Raudsepp, though I didn't use his patch.
Tom Lane [Tue, 6 Sep 2011 18:50:28 +0000 (14:50 -0400)]
Avoid possibly accessing off the end of memory in SJIS2004 conversion.
The code in shift_jis_20042euc_jis_2004() would fetch two bytes even when
only one remained in the string. Since conversion functions aren't
supposed to assume null-terminated input, this poses a small risk of
fetching past the end of memory and incurring SIGSEGV. No such crash has
been identified in the field, but we've certainly seen the equivalent
happen in other code paths, so patch this one all the way back.
Tom Lane [Tue, 6 Sep 2011 18:35:55 +0000 (14:35 -0400)]
Avoid possibly accessing off the end of memory in examine_attribute().
Since the last couple of columns of pg_type are often NULL,
sizeof(FormData_pg_type) can be an overestimate of the actual size of the
tuple data part. Therefore memcpy'ing that much out of the catalog cache,
as analyze.c was doing, poses a small risk of copying past the end of
memory and incurring SIGSEGV. No such crash has been identified in the
field, but we've certainly seen the equivalent happen in other code paths,
so patch this one all the way back.
Per valgrind testing by Noah Misch, though this is not his proposed patch.
I chose to use SearchSysCacheCopy1 rather than inventing special-purpose
infrastructure for copying only the minimal part of a pg_type tuple.
Tom Lane [Tue, 6 Sep 2011 16:14:51 +0000 (12:14 -0400)]
Update type-conversion documentation for long-ago changes.
This example wasn't updated when we changed the behavior of bpcharlen()
in 8.0, nor when we changed the number of parameters taken by the bpchar()
cast function in 7.3. Per report from lsliang.
Tom Lane [Sat, 3 Sep 2011 20:17:57 +0000 (16:17 -0400)]
Fix typo in pg_srand48 (srand48 in older branches).
">" should be ">>". This typo results in failure to use all of the bits
of the provided seed.
This might rise to the level of a security bug if we were relying on
srand48 for any security-critical purposes, but we are not --- in fact,
it's not used at all unless the platform lacks srandom(), which is
improbable. Even on such a platform the exposure seems minimal.
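The consequence of writing ">" where ">>" was meant can be sketched as follows. The split of the seed into two 16-bit words is an assumption about how the generator state is initialized, and seed_words is a hypothetical helper.

```python
def seed_words(seed: int, typo: bool = False):
    """Split a 32-bit seed into the two 16-bit words stored in the state."""
    low = seed & 0xFFFF
    if typo:
        high = int(seed > 16)         # comparison: yields only 0 or 1
    else:
        high = (seed >> 16) & 0xFFFF  # shift: yields the upper 16 bits
    return low, high
```

With the typo, two seeds that differ only in their upper 16 bits produce the same state, so those bits of the seed are wasted.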
Move the line to undefine setlocale() macro on Win32 outside USE_REPL_SNPRINTF
ifdef block. It has nothing to do with whether the replacement snprintf
function is used. It caused no live bug, because the replacement snprintf
function is always used on Win32, but it was nevertheless misplaced.
The version of this macro used in autoconf 2.59 is capable of incorrectly
succeeding (ie, reporting that a library function is available when it
isn't), if the compiler performs link-time optimization and decides that
it can optimize the function reference away entirely. Replace it with the
coding used in autoconf 2.61 and later, which forces the program result to
depend on the function's result so that it cannot be optimized away. This
should fix build failures currently being seen on buildfarm member anchovy.
This patch affects the 8.2 and 8.3 branches only, since later branches are
using autoconf versions that don't have this problem.
Tom Lane [Sat, 27 Aug 2011 20:37:17 +0000 (16:37 -0400)]
Don't assume that "E" response to NEGOTIATE_SSL_CODE means pre-7.0 server.
These days, such a response is far more likely to signify a server-side
problem, such as fork failure. Reporting "server does not support SSL"
(in sslmode=require) could be quite misleading. But the results could
be even worse in sslmode=prefer: if the problem was transient and the
next connection attempt succeeds, we'll have silently fallen back to
protocol version 2.0, possibly disabling features the user needs.
Hence, it seems best to just eliminate the assumption that backing off
to non-SSL/2.0 protocol is the way to recover from an "E" response, and
instead treat the server error the same as we would in non-SSL cases.
I tested this change against a pre-7.0 server, and found that there
was a second logic bug in the "prefer" path: the test to decide whether
to make a fallback connection attempt assumed that we must have opened
conn->ssl, which in fact does not happen given an "E" response. After
fixing that, the code does indeed connect successfully to pre-7.0,
as long as you didn't set sslmode=require. (If you did, you get
"Unsupported frontend protocol", which isn't completely off base
given the server certainly doesn't support SSL.)
Since there seems no reason to believe that pre-7.0 servers exist anymore
in the wild, back-patch to all supported branches.
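The decision described above can be sketched roughly as follows. This is an illustrative Python model, not libpq code: the single-byte replies 'S' and 'N' are the documented answers to an SSL request, while the function name and the returned action strings are hypothetical.

```python
def handle_ssl_reply(reply, sslmode):
    """Decide what to do with the server's one-byte reply to an SSL request."""
    if reply == b"S":
        # Server is willing to do SSL: proceed with the handshake.
        return "start_ssl"
    if reply == b"N":
        # Server is unwilling: fatal only if the user demanded SSL.
        if sslmode == "require":
            return "fail: server does not support SSL, but SSL was required"
        return "continue_without_ssl"
    if reply == b"E":
        # Start of an error message: report the server error as in the
        # non-SSL case, rather than assuming an old server and falling
        # back to a non-SSL/2.0-protocol connection.
        return "read_server_error"
    return "fail: unexpected reply"
```

The point of the sketch is the 'E' branch: it no longer depends on sslmode and never triggers a silent protocol downgrade.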
Tom Lane [Sat, 27 Aug 2011 18:16:35 +0000 (14:16 -0400)]
Ensure we discard unread/unsent data when abandoning a connection attempt.
There are assorted situations wherein PQconnectPoll() will abandon a
connection attempt and try again with different parameters (eg, SSL versus
not SSL). However, the code forgot to discard any pending data in libpq's
I/O buffers when doing this. In at least one case (server returns E
message during SSL negotiation), there is unread input data which bollixes
the next connection attempt. I have not checked to see whether this is
possible in the other cases where we close the socket and retry, but it
seems like a matter of good defensive programming to add explicit
buffer-flushing code to all of them.
This is one of several issues exposed by Daniel Farina's report of
misbehavior after a server-side fork failure.
This has been wrong since forever, so back-patch to all supported branches.
Tom Lane [Fri, 26 Aug 2011 20:51:57 +0000 (16:51 -0400)]
Fix potential memory clobber in tsvector_concat().
tsvector_concat() allocated its result workspace using the "conservative"
estimate of the sum of the two input tsvectors' sizes. Unfortunately that
wasn't so conservative as all that, because it supposed that the number of
pad bytes required could not grow. Which it can, as per test case from
Jesper Krogh, if there's a mix of lexemes with positions and lexemes
without them in the input data. The fix is to assume that we might add
a not-previously-present pad byte for each and every lexeme in the two
inputs; which really is conservative, but it doesn't seem worthwhile to
try to be more precise.
This is an aboriginal bug in tsvector_concat, so back-patch to all
versions containing it.
Tom Lane [Thu, 25 Aug 2011 03:50:31 +0000 (23:50 -0400)]
Fix pgstatindex() to give consistent results for empty indexes.
For an empty index, the pgstatindex() function would compute 0.0/0.0 for
its avg_leaf_density and leaf_fragmentation outputs. On machines that
follow the IEEE float arithmetic standard with any care, that results in
a NaN. However, per report from Rushabh Lathia, Microsoft couldn't
manage to get this right, so you'd get a bizarre error on Windows.
Fix by forcing the results to be NaN explicitly, rather than relying on
the division operator to give that or the snprintf function to print it
correctly. I have some doubts that this is really the most useful
definition, but it seems better to remain backward-compatible with
those platforms for which the behavior wasn't completely broken.
Back-patch to 8.2, since the code is like that in all current releases.
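A minimal sketch of the "force NaN explicitly" idea, in Python. The helper name is hypothetical and the formula is a generic percentage rather than the exact pgstatindex computation.

```python
import math


def percent_or_nan(numerator, denominator):
    """Return 100 * numerator / denominator, or NaN for an empty index."""
    if denominator == 0:
        # Produce NaN deliberately instead of relying on what the
        # platform's division or formatting code does with 0.0/0.0.
        return math.nan
    return 100.0 * numerator / denominator
```

Checking the denominator up front keeps the result identical on every platform, which is what the fix is after.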
Tom Lane [Sat, 20 Aug 2011 18:51:02 +0000 (14:51 -0400)]
Fix performance problem when building a lossy tidbitmap.
As pointed out by Sergey Koposov, repeated invocations of tbm_lossify can
make building a large tidbitmap into an O(N^2) operation. To fix, make
sure we remove more than the minimum amount of information per call, and
add a fallback path to behave sanely if we're unable to fit the bitmap
within the requested amount of memory.
This has been wrong since the tidbitmap code was written, so back-patch
to all supported branches.
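The quadratic behavior can be seen in a toy model. The Python sketch below is not the tidbitmap code: it only counts how often a shrink pass runs when entries are added one at a time, and the `target` parameter (how far each pass shrinks the bitmap) is an assumption made for illustration.

```python
def count_lossify_calls(n_inserts, max_entries, target):
    """Count shrink passes while inserting n_inserts entries one by one.

    A pass runs whenever the entry count exceeds max_entries and
    reduces the count to target.
    """
    entries = 0
    calls = 0
    for _ in range(n_inserts):
        entries += 1
        if entries > max_entries:
            calls += 1
            entries = target
    return calls
```

If each pass removes only the minimum (target equal to max_entries), nearly every later insert triggers another pass, and since a pass costs time proportional to the bitmap size, the total work is quadratic. Shrinking well below the limit, for example to half of it, makes passes rare.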
Tom Lane [Tue, 16 Aug 2011 17:12:23 +0000 (13:12 -0400)]
Fix race condition in relcache init file invalidation.
The previous code tried to synchronize by unlinking the init file twice,
but that doesn't actually work: it leaves a window wherein a third process
could read the already-stale init file but miss the SI messages that would
tell it the data is stale. The result would be bizarre failures in catalog
accesses, typically "could not read block 0 in file ..." later during
startup.
Instead, hold RelCacheInitLock across both the unlink and the sending of
the SI messages. This is more straightforward, and might even be a bit
faster since only one unlink call is needed.
This has been wrong since it was put in (in 2002!), so back-patch to all
supported releases.
Tom Lane [Thu, 28 Jul 2011 18:07:23 +0000 (14:07 -0400)]
Fix pg_restore's direct-to-database mode for standard_conforming_strings.
pg_backup_db.c contained a mini SQL lexer with which it tried to identify
boundaries between SQL commands, but that code was not designed to cope
with standard_conforming_strings, and would get the wrong answer if a
backslash immediately precedes a closing single quote in such a string,
as per report from Julian Mehnle. The bug only affects direct-to-database
restores from archive files made with standard_conforming_strings = on.
Rather than complicating the code some more to try to fix that, let's just
rip it all out. The only reason it was needed was to cope with COPY data
embedded into ordinary archive entries, which was a layout that was used
only for about the first three weeks of the archive format's existence,
and never in any production release of pg_dump. Instead, just rely on the
archive file layout to tell us whether we're printing COPY data or not.
This bug represents a data corruption hazard in all releases in which
standard_conforming_strings can be turned on, ie 8.2 and later, so
back-patch to all supported branches.
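To show the failure mode, here is a small Python sketch of a string-literal scanner. It is a hypothetical stand-in for the kind of mini lexer described above, not the removed pg_backup_db.c code.

```python
def string_literal_end(sql, start, standard_conforming_strings):
    """Return the index of the quote closing the literal opened at start.

    sql[start] must be the opening single quote. Returns -1 if no
    closing quote is found.
    """
    i = start + 1
    while i < len(sql):
        ch = sql[i]
        if ch == "\\" and not standard_conforming_strings:
            i += 2  # backslash escapes the next character
            continue
        if ch == "'":
            if i + 1 < len(sql) and sql[i + 1] == "'":
                i += 2  # doubled quote is an escaped quote
                continue
            return i
        i += 1
    return -1
```

For the input `INSERT INTO t VALUES ('C:\');` a scanner that treats backslash as an escape never finds the end of the literal, so it misjudges the command boundary, whereas the standard-conforming reading ends the literal right after the backslash.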
Tom Lane [Mon, 25 Jul 2011 03:29:27 +0000 (23:29 -0400)]
Fix previous patch so it also works if not USE_SSL (mea culpa).
On balance, the need to cover this case changes my mind in favor of pushing
all error-message generation duties into the two fe-secure.c routines.
So do it that way.
Tom Lane [Sun, 24 Jul 2011 20:29:30 +0000 (16:29 -0400)]
Improve libpq's error reporting for SSL failures.
In many cases, pqsecure_read/pqsecure_write set up useful error messages,
which were then overwritten with useless ones by their callers. Fix this
by defining the responsibility to set an error message to be entirely that
of the lower-level function when using SSL.
Back-patch to 8.3; the code is too different in 8.2 to be worth the
trouble.
Tom Lane [Sun, 24 Jul 2011 19:18:12 +0000 (15:18 -0400)]
Use OpenSSL's SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER flag.
This disables an entirely unnecessary "sanity check" that causes failures
in nonblocking mode, because OpenSSL complains if we move or compact the
write buffer. The only actual requirement is that we not modify pending
data once we've attempted to send it, which we don't. Per testing and
research by Martin Pihlak, though this fix is a lot simpler than his patch.
I put the same change into the backend, although it's less clear whether
it's necessary there. We do use nonblocking mode in some situations in
streaming replication, so it seems best to keep the same behavior in the
backend as in libpq.
Magnus Hagander [Sat, 16 Jul 2011 17:58:53 +0000 (19:58 +0200)]
Fix SSPI login when multiple roundtrips are required.
This fixes SSPI login failures showing "The function
requested is not supported", often showing up when connecting
to localhost. The reason was not properly updating the SSPI
handle when multiple roundtrips were required to complete the
authentication sequence.
Report and analysis by Ahmed Shinwari, patch by Magnus Hagander.
Fix two ancient bugs in GiST code to re-find a parent after page split:
First, when following a right-link, we incorrectly marked the current page
as the parent of the right sibling. In reality, the parent of the right page
is the same as the parent of the current page (or some page to the right of
it; gistFindCorrectParent() will sort that out).
Secondly, when we follow a right-link, we must prepend, not append, the right
page to our list of pages to visit. That's because we assume that once we
hit a leaf page in the list, all the rest are leaf pages too, and give up.
To hit these bugs, you need concurrent actions and several unlucky accidents.
Another backend must split the root page, while you're in the process of
splitting a lower-level page. Furthermore, while you scan the internal nodes
to re-find the parent, another backend needs to again split some more internal
pages. Even then, the bugs don't necessarily manifest as user-visible errors
or index corruption.
While we're at it, make the error reporting a bit better if gistFindPath()
fails to re-find the parent. It used to be an assertion, but an elog() seems
more appropriate.
Tom Lane [Tue, 5 Jul 2011 16:04:40 +0000 (12:04 -0400)]
Fix psql's counting of script file line numbers during COPY.
handleCopyIn incremented pset.lineno for each line of COPY data read from
a file. This is correct when reading from the current script file (i.e.,
we are doing COPY FROM STDIN followed by in-line data), but it's wrong if
the data is coming from some other file. Per bug #6083 from Steve Haslam.
Back-patch to all supported versions.
Tom Lane [Sun, 3 Jul 2011 20:40:34 +0000 (16:40 -0400)]
Back-patch creation of tar.bz2 tarball during "make dist".
Since commit a4d03bbcdaf7739d7e9073ee76bb186f68ddc163, "make dist" has
built both gzip- and bzip2-compressed tarballs. However, this was
pretty useless, because our tarball build script didn't know about it
and proceeded to overwrite the bz2 file with new data. Back-patch the
change to all active branches, so that creation of the tar.bz2 file
can be removed from the build script.
Tom Lane [Tue, 21 Jun 2011 18:41:05 +0000 (14:41 -0400)]
Apply upstream fix for blowfish signed-character bug (CVE-2011-2483).
A password containing a character with the high bit set was misprocessed
on machines where char is signed (which is most). This could cause the
preceding one to three characters to fail to affect the hashed result,
thus weakening the password. The result was also unportable, and failed
to match some other blowfish implementations such as OpenBSD's.
Since the fix changes the output for such passwords, upstream chose
to provide a compatibility hack: password salts beginning with $2x$
(instead of the usual $2a$ for blowfish) are intentionally processed
"wrong" to give the same hash as before. Stored password hashes can
thus be modified if necessary to still match, though it'd be better
to change any affected passwords.
In passing, sync a couple other upstream changes that marginally improve
performance and/or tighten error checking.
Back-patch to all supported branches. Since this issue is already
public, no reason not to commit the fix ASAP.
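The sign-extension effect can be modeled in a few lines of Python. This is a simulation of the C behavior under the assumption that key bytes are packed into 32-bit words by shifting and OR-ing; the function name and the `signed_char` flag are hypothetical.

```python
def pack_word(key_bytes, signed_char):
    """Pack up to four key bytes into a 32-bit word, most significant first."""
    word = 0
    for b in key_bytes:
        word = (word << 8) & 0xFFFFFFFF
        if signed_char and b >= 0x80:
            # A signed char with the high bit set is sign-extended when
            # widened, so the OR below also sets the upper 24 bits.
            b |= 0xFFFFFF00
        word |= b
    return word
```

With `signed_char=True`, the keys `b"abc\xff"` and `b"xyz\xff"` pack to the same word, so the three characters before the high-bit byte no longer influence the result. With `signed_char=False` the two words differ.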
Tom Lane [Fri, 17 Jun 2011 23:13:21 +0000 (19:13 -0400)]
Don't use "cp -i" in the example WAL archive_command.
This is a dangerous example to provide because on machines with GNU cp,
it will silently do the wrong thing and risk archive corruption. Worse,
during the 9.0 cycle somebody "improved" the discussion by removing the
warning that used to be there about that, and instead leaving the
impression that the command would work as desired on most Unixen.
It doesn't. Try to rectify the damage by providing an example that is safe
most everywhere, and then noting that you can try cp -i if you want but
you'd better test that.
In back-patching this to all supported branches, I also added an example
command for Windows, which wasn't provided before 9.0.
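The property an archive command needs is "copy, but never overwrite an existing file, and report failure if it already exists". A minimal Python sketch of that behavior follows. It is not the command given in the documentation, and the function name is hypothetical.

```python
import os
import shutil


def archive_wal(src_path, dest_path):
    """Copy src_path to dest_path, refusing to overwrite.

    Returns True on success and False if dest_path already exists.
    """
    try:
        # O_EXCL makes creation fail if the destination is already there,
        # with no interactive prompt involved.
        fd = os.open(dest_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    except FileExistsError:
        return False
    with os.fdopen(fd, "wb") as dst, open(src_path, "rb") as src:
        shutil.copyfileobj(src, dst)
    return True
```

A wrapper script using something like this would need to exit with a nonzero status when the function returns False, so that the failure is visible to the server.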
Tom Lane [Fri, 17 Jun 2011 22:19:26 +0000 (18:19 -0400)]
Obtain table locks as soon as practical during pg_dump.
For some reason, when we (I) added table lock acquisition to pg_dump,
we didn't think about making it happen as soon as possible after the
start of the transaction. What with subsequent additions, there was
actually quite a lot going on before we got around to that; which sort
of defeats the purpose. Rearrange the order of calls in dumpSchema()
to close the risk window as much as we easily can. Back-patch to all
supported branches.
Robert Haas [Fri, 17 Jun 2011 18:28:45 +0000 (14:28 -0400)]
Add overflow checks to int4 and int8 versions of generate_series().
The previous code went into an infinite loop after overflow. In fact,
an overflow is not really an error; it just means that the current
value is the last one we need to return. So, just arrange to stop
immediately when overflow is detected.
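The stopping rule can be sketched in Python as follows. Python integers do not overflow, so the 32-bit range is checked explicitly; the generator name and the range constants are assumptions for illustration, not the backend code.

```python
INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)


def generate_series_int4(start, stop, step=1):
    """Yield start, start+step, ... up to stop, within the int4 range."""
    if step == 0:
        raise ValueError("step size cannot equal zero")
    current = start
    while (step > 0 and current <= stop) or (step < 0 and current >= stop):
        yield current
        nxt = current + step
        if nxt > INT32_MAX or nxt < INT32_MIN:
            # Overflow is not an error: the value just returned was
            # simply the last one in the series.
            return
        current = nxt
```

Without the range check, a wrapped-around value would again satisfy the loop condition, which is the infinite loop the commit describes.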
Tom Lane [Tue, 14 Jun 2011 21:14:06 +0000 (17:14 -0400)]
Suppress -arch switches in the output of ExtUtils::Embed.
We previously found out that OS X's standard perl installation tries to put
-arch switches into Perl link commands, evidently in hopes of building
universal binaries. But it doesn't work to add such switches in plperl's
link step if they weren't being used earlier, so this is basically
unworkable. When using gcc the result is only some warnings; but LLVM
fails entirely, so this issue isn't as cosmetic as we originally thought.
Hence, back-patch commit d69a419e682c2d39c2355105a7e5e2b90357c8f0 into
pre-9.0 branches.
Tom Lane [Tue, 14 Jun 2011 20:24:45 +0000 (16:24 -0400)]
Fix assorted issues with build and install paths containing spaces.
Apparently there is no buildfarm critter exercising this case after all,
because it fails in several places. With this patch, build, install,
check-world, and installcheck-world pass for me on OS X.
Tom Lane [Fri, 10 Jun 2011 21:03:21 +0000 (17:03 -0400)]
Work around gcc 4.6.0 bug that breaks WAL replay.
ReadRecord's habit of using both direct references to tmpRecPtr and
references to *RecPtr (which is pointing at tmpRecPtr) triggers an
optimization bug in gcc 4.6.0, which apparently has forgotten about
aliasing rules. Avoid the compiler bug, and make the code more readable
to boot, by getting rid of the direct references. Improve the comments
while at it.
Back-patch to all supported versions, in case they get built with 4.6.0.
Tom Lane, with some cosmetic suggestions from Alex Hunsaker